For example, this article says it can't do coding exercises outside the training set. That would definitely be on the "AGI checklist". Basically, doing anything outside the training set would be on that list.
I will get excited for/scared of LLMs when they can tackle this kind of problem. But I don't believe they can, because of the fundamental nature of their design, which is both backward-looking (thus no better than the human state of the art) and lacking in human intuition and self-awareness. Or rather, I believe that writing the prompt required to get an LLM to produce such a program is a problem of at least equivalent complexity to implementing the program without an LLM.
That’s possible for a highly intelligent, extensively trained, very small subset of humans.
That also ignores the fact that the small set of humans capable of building programming languages and compilers is a consequence of specialization and lack of interest. There are plenty of humans who are capable of learning how to do it. LLMs, on the other hand, are specialized for the task and are neither lazy nor uninterested.
I've personally had some mild success getting these UTM variants to output their own children in a meta-programming arrangement. The base program only has access to the valid instruction set of ~12 instructions per byte, while the task program has access to the full range of instructions and data per byte (256). By training only the base program, we reduce the search space by a very substantial factor; a rough sketch of that arithmetic follows the reference below. I think this would be similar to the idea of a self-hosted compiler, etc. I don't think it would be too much of a stretch to give it access to x86 instructions and a full VM once a certain amount of bootstrapping has been achieved.
[0]: https://arxiv.org/abs/2406.19108
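For concreteness, here's a minimal sketch of the search-space point, assuming a BFF-style machine where every byte can be executed but only a handful of byte values are valid opcodes. The opcode list and program length below are illustrative, not the exact setup from [0]:

```python
# Illustrative only: base programs are sampled solely from the valid
# opcode set, while task programs may use any byte as code or data.
import random

OPCODES = list(b"<>{}+-.,[]")  # ~10-12 valid instructions (illustrative)
BYTE_RANGE = 256               # full range available to task programs

def random_base_program(n: int) -> bytes:
    """Base program: drawn only from the valid instruction set."""
    return bytes(random.choice(OPCODES) for _ in range(n))

def random_task_program(n: int) -> bytes:
    """Task program: any byte value is allowed."""
    return bytes(random.randrange(BYTE_RANGE) for _ in range(n))

# Search-space sizes for a length-n program: len(OPCODES)**n vs 256**n.
n = 64
reduction = (BYTE_RANGE / len(OPCODES)) ** n
print(f"restricting the base program shrinks the space by ~{reduction:.1e}x")
```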
Things like driving a car, folding laundry, running an errand, doing some basic math.
You'll notice that two of those require some form of robot or mobility. I think that is key -- you can't have AGI without the ability to interact with the world in a way similar to most humans.
There, you don't need to invoke Turing or compiler bootstrapping. You just need one example of a use case where the accuracy of responses is mission-critical.
https://chatgpt.com/share/67373737-04a8-800d-bc57-de74a415e2...
I think the parent comment's challenge is more appropriate.
A crucial element of AGI would be the ability to self-train on self-generated data, online. So it's not really AGI if there is a hard distinction between training and inference (though it may still be very capable), and it's not really AGI if it can't work its way through novel problems on its own.
The ability to immediately solve a problem it's never seen before is too high a bar, I think.
And yes, my definition still excludes a lot of humans in a lot of fields. That's a bullet I'm willing to bite.
(That’s not to say that humans don’t tend to lose some of their flexibility over their individual lifetimes as well.)
That's not true. Humans can learn.
An LLM is just a tool. If it can't do what you want, then too bad.
Depends on how you define “self-awareness”, but knowing that it doesn't know something instead of hallucinating a plausible-but-wrong answer is already self-awareness of some kind. And it's both highly valuable and beyond current tech's capability.
https://openai.com/index/introducing-simpleqa/
especially the section “Using SimpleQA to measure the calibration of large language models”
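Roughly, "calibration" there means: when the model claims 80% confidence, is it actually right about 80% of the time? A toy sketch of that measurement (the confidence/correctness pairs are made up; this is not OpenAI's evaluation code):

```python
# Toy calibration check: bucket answers by the model's stated confidence
# and compare stated confidence against observed accuracy.
from collections import defaultdict

# (stated confidence, was the answer correct) -- fabricated for illustration
results = [(0.9, True), (0.9, False), (0.9, True),
           (0.6, True), (0.6, False),
           (0.3, False), (0.3, False), (0.3, True)]

buckets = defaultdict(list)
for confidence, correct in results:
    buckets[confidence].append(correct)

for confidence in sorted(buckets):
    outcomes = buckets[confidence]
    accuracy = sum(outcomes) / len(outcomes)
    print(f"stated {confidence:.0%} -> observed {accuracy:.0%} "
          f"over {len(outcomes)} answers")

# A well-calibrated model has observed accuracy close to stated confidence;
# a hallucination-prone model is overconfident (stated well above observed).
```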
I'm wondering whether it would count if one extended it with an external program that gives it feedback during inference (by another prompt) about the correctness of its output (roughly the loop sketched below).
I guess it wouldn't, because RAG tools kind of do that already, and I've heard no one calling those self-aware.
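For concreteness, the loop I have in mind is something like this: generate, check with an external verifier, feed the verdict back as another prompt. Both functions here are hypothetical stand-ins, not any particular tool's API:

```python
# Sketch of an external feedback loop during inference. `generate` and
# `verify` are hypothetical stand-ins for an LLM call and an external
# checker (unit tests, a solver, retrieval, ...).

def generate(prompt: str) -> str:
    """Stand-in for an LLM completion call."""
    raise NotImplementedError

def verify(answer: str) -> tuple[bool, str]:
    """Stand-in for an external program judging correctness."""
    raise NotImplementedError

def answer_with_feedback(question: str, max_rounds: int = 3) -> str:
    answer = generate(question)
    for _ in range(max_rounds):
        ok, critique = verify(answer)
        if ok:
            return answer
        # Feed the verifier's verdict back in as another prompt.
        answer = generate(
            f"{question}\nYour previous answer:\n{answer}\n"
            f"A checker found this problem:\n{critique}\nTry again."
        )
    return answer
```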
If you have an external program, then by definition it's not self-awareness ;). Also, it's not about correctness per se, but about the model's ability to assess its own knowledge (making a mistake because the model was exposed to mistakes in the training data is fine; hallucinating isn't).
That is definitely an ability that current LLMs lack.
https://plato.stanford.edu/entries/chinese-room/
The idea that "human-like" behaviour will lead to self-awareness is both unproven (it can't be proven until it happens) and impossible to disprove (like Russell's teapot). Yet one common assumption of many people running these companies or investing in them, or of some developers investing their time in these technologies, is precisely that some sort of explosion of superintelligence is likely, or even inevitable.
It surely is possible, but stretching that to "likely" seems a bit much when you consider how imperfectly we understand things like consciousness and the mind.
Of course there are people who have essentially religious reactions to the notion that there may be limits to certain domains of knowledge. Nonetheless, I think that's the reality we're faced with here.
I think Searle's view was that:
- while it cannot be _disproven_, the Chinese Room argument was meant to provide reasons against believing it
- the "it can't be proven until it happens" part is a misunderstanding: you won't know if it happens, because the objective, externally available attributes don't indicate whether self-awareness (or indeed awareness at all) is present
> while it cannot be _disproven_, the Chinese Room argument was meant to provide reasons against believing it
Yes, like Russell's teapot. I also think that's what Searle means.
> the "it can't be proven until it happens" part is misunderstanding: you won't know if it happens because the objective, externally available attributes don't indicate whether self-awareness (or indeed awareness at all) is present
Yes, agreed, I believe that's what Searle is saying too. I think I was maybe being ambiguous here - I wanted to say that even if you forgave the AI maximalists for ignoring all relevant philosophical work, the notion that "appearing human-like" inevitably leads to what would actually be "consciousness" or "intelligence" is a very big claim.
Searle goes further, and I'm not sure if I follow him all the way, personally, but it's a side point.