I don't claim or believe that any LLM is actually intelligent. It just seems that we (at least on an individual basis) can also meet the criteria outlined above. I know plenty of people who are confidently incorrect and appear unwilling to learn or accept their own limitations, myself included.
In my opinion, even if we did have AGI it would still exhibit a lot of our foibles given that we'd be the only ones teaching it.
I feel like if you have any belief in philosophy then LLMs can only be interpreted as a parlour trick (on steroids). Perhaps we are fanciful in believing we are something greater than LLMs, but there is the idea that we respond using rhetoric based on trying to find reason within what we have learned and observed. From my primitive understanding, an LLM's rhetoric and reasoning are entirely implied, based on an effectively infinite (compared to the limits of human capacity to store information) amount of knowledge it has consumed.
I think if LLMs were equivalent to human thinking then we'd all be a hell of a lot stupider, given our lack of "infinite" knowledge compared to LLMs.
In humans, “hallucination” means observing false inputs. In GPT, it means creating false outputs.
Completely different with massively different connotations.
"Hallucination" is a term that works well for actual intelligence - when you "know" something that isn't true, and has no path of reasoning, you might have hallucinated the base "knowledge".
But that doesn't really work for LLMs, because there's no knowledge at all. All they're doing is picking the next most likely token based on the probabilities. If you interrogate something that the training data covers thoroughly, you'll get something that is "correct", and that's to be expected because there are a lot of probabilities pointing to the "next token" being the right one... but as you get to the edge of the training data, the "next token" is less likely to be correct.
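To make "picking the next most likely token" concrete, here's a minimal sketch. It's a toy follower-count table, not how a real LLM is built (those use neural networks over long contexts), and the corpus and the names `train` and `next_token` are made up for illustration. It only shows that well-covered contexts get a confident pick and contexts at the edge of the data get a weak one or none.

```python
from collections import Counter, defaultdict

def train(tokens):
    """Count which token follows each token in the training sequence."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token(counts, prev):
    """Return (most likely next token, its estimated probability)."""
    followers = counts.get(prev)
    if not followers:
        # Nothing in the training data ever followed this token.
        return None, 0.0
    token, n = followers.most_common(1)[0]
    return token, n / sum(followers.values())

# Hypothetical toy corpus, chosen only for illustration.
corpus = "the cat sat on the mat the cat ate the fish".split()
counts = train(corpus)

well_covered = next_token(counts, "on")    # "on" is always followed by "the"
ambiguous = next_token(counts, "the")      # "the" has several followers
off_the_edge = next_token(counts, "fish")  # nothing ever follows "fish"
```

Nothing in the table says what "cat" or "fish" means; there are only counts.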
As a thought experiment, imagine that you're given a book with every possible or likely sequence of coloured circles, triangles, and squares. None of them have meaning to you, they're just colours and shapes in random-seeming sequences, but there's a frequency to them. "Red circle, blue square, green triangle" is a much more common sequence than "red circle, blue square, black triangle", so if someone hands you a piece of paper with "red circle, blue square", you can reasonably guess that what they want back is a green triangle.
Expand the model a bit more, and you notice that "rc bs gt" is pretty common, but if there's a yellow square a few symbols before with anything in between, then the triangle is usually black. Thus the response to the sequence "red circle, blue square" is usually "green triangle", but "black circle, yellow square, grey circle, red circle, blue square" is modified by the yellow square, and the response is "black triangle"... but you still don't know what any of these things _mean_.
When you get to a sequence that isn't covered directly by the training data, you just follow the process with the information that you _do_ have. You get "red triangle, blue square" and while you've not encountered that sequence before, "green" _usually_ comes after "red, blue", and "circle" is _usually_ grouped with "triangle, square", so a reasonable response is "green circle"... but we don't know, we're just guessing based on what we've seen.
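The shapes game can be sketched in a few lines. This is a rough illustration under my own assumptions: the training sequences are invented, colour is guessed from the previous two colours, shape is guessed from the (unordered) pair of previous shapes, and the yellow-square modifier is left out to keep it short.

```python
from collections import Counter, defaultdict

def train(sequences):
    """Build two frequency tables from sequences of (colour, shape) symbols."""
    colour_counts = defaultdict(Counter)  # (colour1, colour2) -> next colour
    shape_counts = defaultdict(Counter)   # {shape1, shape2}   -> next shape
    for seq in sequences:
        for (c1, s1), (c2, s2), (c3, s3) in zip(seq, seq[1:], seq[2:]):
            colour_counts[(c1, c2)][c3] += 1
            shape_counts[frozenset((s1, s2))][s3] += 1
    return colour_counts, shape_counts

def predict(model, context):
    """Guess the next symbol from the last two; same procedure for any input.

    Assumes the colour pair and shape pair each appear somewhere in training.
    """
    colour_counts, shape_counts = model
    (c1, s1), (c2, s2) = context[-2:]
    colour = colour_counts[(c1, c2)].most_common(1)[0][0]
    shape = shape_counts[frozenset((s1, s2))].most_common(1)[0][0]
    return colour, shape

# Invented "book" of sequences: green triangle is the common continuation.
common = [("red", "circle"), ("blue", "square"), ("green", "triangle")]
rare = [("red", "circle"), ("blue", "square"), ("black", "triangle")]
other = [("yellow", "triangle"), ("grey", "square"), ("black", "circle")]
model = train([common] * 3 + [rare] + [other] * 2)

seen = predict(model, [("red", "circle"), ("blue", "square")])
unseen = predict(model, [("red", "triangle"), ("blue", "square")])
```

With this data, `seen` comes out as a green triangle and `unseen` as a green circle, even though "red triangle, blue square" never appears in the book. `predict` has no separate branch for the unfamiliar case.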
That's the thing... the process is exactly the same whether the sequence has been seen before or not. You're not _hallucinating_ the green circle, you're just picking based on probabilities. LLMs are doing effectively this, but at massive scale with an unthinkably large dataset as training data. Because there's so much data of _humans talking to other humans_, ChatGPT has a lot of probabilities that make human-sounding responses...
It's not an easy concept to get across, but there's a fundamental difference between "knowing a thing and being able to discuss it" and "picking the next token based on the probabilities gleaned from inspecting terabytes of text, without understanding what any single token means".
An LLM doesn't have that. It's a very impressive parlour trick (and of course a lot more), but its use, albeit massive, is hence limited to that.
Chaining and context help resolve that to some extent, but only to a limited extent.
That's the argument anyway. That doesn't mean it's not incredibly impressive, but comparing it to human self-awareness, however small, isn't a fair comparison.
It's next token prediction, which is why it does classification so well.