Does ChatGPT need a way to verify reality for itself to become truly intelligent?
The key difference appears to be a combination of two factors. First, we're more likely to have learnt which subjects we're not very knowledgeable about through extensive feedback, be it school or conversations where we're told we're wrong, feedback that also appears to teach us to be more cautious in general. Second, we appear to have a somewhat better (though far from perfect) ability to separate memory from speculation about our knowledge. We can certainly go off on wild tangents and make stuff up about any subject, but we're reinforced from a very young age to understand that there's a time and a place for making things up vs. drawing on memory.
Both go back simply to extensive (many years of) reinforcement teaching us that making things up and/or believing things that aren't true has negative consequences, and yet we still do both, just usually not as blatantly, or with as little awareness, as current LLMs.
So I'd expect one missing component is a training step that subjects the model to batteries of tests probing the limits of its knowledge and incorporates the results of that into the training.
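To make that concrete, here's a minimal, hypothetical sketch of what such a probing step might look like. Everything in it (the StubModel class, answer_with_confidence, the confidence threshold) is invented purely for illustration; a real pipeline would query an actual model and feed the resulting pairs into a fine-tuning or RLHF-style pass.

    from typing import List, Tuple

    class StubModel:
        """Stand-in for an LLM that returns an answer plus a self-reported confidence."""
        def answer_with_confidence(self, question: str) -> Tuple[str, float]:
            # A real model would generate text and estimate confidence,
            # e.g. from token log-probabilities or a separate verifier.
            return "Paris", 0.9

    def build_calibration_set(model: StubModel,
                              labeled_questions: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
        """Probe the model on questions with known answers and build training pairs
        that reward admitting uncertainty when it is confidently wrong."""
        examples = []
        for question, reference in labeled_questions:
            answer, confidence = model.answer_with_confidence(question)
            if answer.strip().lower() == reference.strip().lower():
                # Keep correct answers so the model isn't pushed into blanket hedging.
                examples.append((question, answer))
            elif confidence > 0.5:
                # Confidently wrong: the desired target is an admission of uncertainty.
                examples.append((question, "I'm not sure."))
        return examples

    if __name__ == "__main__":
        questions = [("What is the capital of France?", "Paris"),
                     ("What is the capital of Australia?", "Canberra")]
        print(build_calibration_set(StubModel(), questions))

The point isn't this particular heuristic, but that the model gets systematic feedback about where its knowledge ends, roughly analogous to the years of social feedback humans get about making things up.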
In other words, the human has a concept of truth or facts and simply had a memory lapse. This thing has no concept of truth at all.
Agreed. I'd like to add another point to the discussion. It seems to me as if LLMs are held to a higher standard regarding telling the truth than humans are. In my opinion, the reason for this is that computers have traditionally been used to solve deterministic tasks, so people are not used to them making wrong claims.
Imagine subjecting a random human to the same battery of conversations and judging the truthfulness of their answers.
Now, imagine doing the same to a child too young to have had many years of reinforcement of the social consequences of not clearly distinguishing fantasy from perceived truth.
I do think a human adult would (still) be likely to be overall better at distinguishing truth from fiction when replying, but I'm not at all confident that a human child would.
I think LLMs will need more reinforcement from probing the limits of their knowledge to make it easier to rely on their responses, but I also think one of the reasons people hold LLMs to the standard they do is that they "sound" knowledgeable. If ChatGPT spoke like a 7-year-old, nobody would take issue with it making a lot of stuff up. But since ChatGPT is more eloquent than most adults, it's easy to expect it to behave like a human adult. LLMs have gaps that are confusing to us because the signs we tend to go by to judge someone's intelligence are not reliable with LLMs.
Paradoxically, it seems as if the people who are pushing this the hardest are the same people who flat out deny even the slightest flicker of what could be considered intelligence.
Saying “most humans make dumb statements” is not a defense for building a machine that makes dumb statements, imo.
If we made a ChatGPT that has some “senses” it can use to verify its perceptions, and it does as well with this as humans generally do, I’m sure we’ll get interesting results but I’m not sure we’ll be any closer to solving the problem.
1) My confidence was around 80%, without any sense of absolute certainty. 2) I had no idea why I knew that, or where I originally learned what a house centipede looks like.
There's a good argument to be made that you learned about potentially dangerous insects as a child, from books aimed at children. A reasonable person would make this argument. It's also likely that you retained that information because your would-be peers who aren't around, aren't around because their ancestors passed their genetic material on to humanity's descendants to a lesser extent than your extant peers' ancestors did.
You might not be able to immediately recall where and when you learned something, and that shouldn't be equated with genuinely having no idea.
I'd settle for a poor facsimile, and argue strongly against human-like.
Human-like would be, let's say, a Cylon skin-job. ChatGPT is just a fraking toaster, at best.
But I agree with GP's eager-anthropomorphization complaint. When algorithms produce verifiably wrong output, we call them errors.
Hallucinations are mistakes in perception.