Rhapso1y ago

"Hallucination" isn't really a problem that can be "fixed". It's just model error.

The root problem is simply that the model doesn't capture reality, just an approximation. What we are incorrectly calling "hallucination" is just the best the model has to offer.

20 comments · 3 top-level

dilap1y ago· 13 in thread

it can be fixed in theory if the model knows-what-it-knows, to avoid saying things it's uncertain about (this is what (some) humans do to reduce the frequency with which they say untrue things).

there's some promising research using this idea, though i don't have it at hand.

hoosieree1y ago

LLMs can't hallucinate. They generate the next most likely token in a sequence. Whether that sequence matches any kind of objective truth is orthogonal to how models work.

I suppose depending on your point of view, LLMs either can't hallucinate, or that's all they can do.

ToValueFunfetti1y ago

>Whether that sequence matches any kind of objective truth is orthogonal to how models work.

Empirically, this cannot be true. If it were, it would be statistically shocking how often models coincidentally say true things. The training does not perfectly align the model with truth, but 'orthogonal' is off by a minimum of 45 degrees.

CooCooCaCha1y ago

Whenever someone takes issue with using the word “hallucinate” with LLMs I get the impression they’re trying to convince me that hallucination is good.

Why do you care so much about this particular issue? And why can’t hallucination be something we can aim to improve?

AnimalMuppet1y ago

I'm pretty sure there's something I don't understand, but:

Doesn't an LLM pick the "most probable next symbol" (or, depending on temperature, one of the most probable next symbols)? To do that, doesn't it have to have some idea of what the probability is? Couldn't it then, if the probability falls below some threshold, say "I don't know" instead of giving what it knows is a low-probability answer?

dTal1y ago

It doesn't really work like that.

1) The model outputs a ranked list of all tokens; the probabilities always sum to 1. Sometimes there is a clear "#1 candidate"; very often there are a number of plausible candidates. This is just how language works - there are multiple ways to phrase things, and you can't have the model give up every time there is a choice of synonyms.

2) Probability of a token is not the same as probability of a fact. Consider a language model that knows the approximate population of Paris (2 million) but is not confident about the exact figure. Feed such a model the string "The exact population of Paris is" and it will begin with "2" but halfway through the number it will have a more or less arbitrary choice of 10 digits. "2.1I don't know" is neither a desirable answer, nor a plausible one from the model's perspective.
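
A minimal sketch of both points, assuming made-up logits (the numbers, the token sets, and the `naive_abstain` helper are all hypothetical, not taken from any real model). It shows that a "give up below a threshold" rule fires both when the model is choosing between synonyms and when it is guessing a digit, so the top-token probability alone cannot tell the two cases apart:

```python
import math

def softmax(logits):
    """Turn raw scores into weights that sum to 1."""
    peak = max(logits.values())
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def naive_abstain(probs, threshold=0.5):
    """Hypothetical rule: say "I don't know" if the top token is below threshold."""
    return max(probs.values()) < threshold

# Made-up logits for illustration only.
# Case 1: three interchangeable phrasings share the probability mass.
synonyms = softmax({"big": 2.0, "large": 2.0, "huge": 2.0, "xylophone": -3.0})
# Case 2: the model has no idea which digit comes next in a number.
digits = softmax({str(d): 0.0 for d in range(10)})

# The rule fires in both cases, although only the second one
# reflects uncertainty about a fact.
print(naive_abstain(synonyms), naive_abstain(digits))
```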

darkPotato1y ago

My understanding is that the hallucination is, out of all the possibilities, the most probable one (ignoring temperature). So the hallucination is the most probable sequence of tokens at that point. The model may be able to predict an "I don't have that information" given the right context. But ensuring that in general is an open question.

viraptor1y ago

> Doesn't an LLM pick the "most probable next symbol"

Yes, but that very rarely matters. (Almost never when it's brought up in discussions)

> Couldn't it then, if the probability falls below some threshold, say "I don't know" instead of giving what it knows is a low-probability answer?

A low probability doesn't necessarily mean something's incorrect. Responding to your question in French would also have very low probability, even if it's correct. There's also some nuance around what's classified as a hallucination... Maybe something in the training data did suggest that answer as correct.

There are ideas similar to this one though. It's just a bit more complex than pure probabilities going down. https://arxiv.org/abs/2405.19648

anon2911y ago

You need to separate out the LLM, which only produces a set of probabilities, from the system, which includes the LLM and the sampling methodology. Sampling is currently not very intelligent at all.

The next bit of confusion is that the 'probability' isn't 'real'. It's not an actual probability but a set of weights that sum to one, which is close enough to how probability works that we call it that. However, sometimes there are several good answers, and so each good answer gets a lower probability because there are 5 of them. A fixed threshold is not a good idea in this case. Instead, smarter sampling methods are necessary. One possibility, when there is apparent confusion, is to put a 'confusion marker' into the text, predict the next output, and train models to refine the answer as they go along. Not sure if any work has been done here, but this seems to go along with what you're interested in.
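
To make the "LLM vs. system" split concrete, here is a rough sketch of a sampler that sits outside the model. Everything in it is an assumption for illustration: the `<confused>` marker, the 0.9 mass cutoff, and the idea of counting how many tokens are needed to cover that mass (instead of thresholding a single token's weight). It is not a published method, just one way a sampler could treat "5 good answers" differently from "no idea at all":

```python
import random

CONFUSION_MARKER = "<confused>"  # hypothetical marker token

def nucleus_size(probs, mass=0.9):
    """Number of top-ranked tokens needed to cover `mass` of the distribution."""
    covered = 0.0
    for count, p in enumerate(sorted(probs.values(), reverse=True), start=1):
        covered += p
        if covered >= mass:
            return count
    return len(probs)

def sample_or_mark(probs, max_candidates=10, rng=random):
    """Sample normally unless the mass is spread over too many tokens."""
    if nucleus_size(probs) > max_candidates:
        return CONFUSION_MARKER
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Five equally good answers: each gets 0.2, which a fixed per-token
# threshold would reject, but only 5 candidates carry the mass.
five_good = {f"answer{i}": 0.2 for i in range(5)}
# Mass smeared thinly over 1000 tokens: treated as confusion here.
diffuse = {f"tok{i}": 0.001 for i in range(1000)}

print(sample_or_mark(five_good))
print(sample_or_mark(diffuse))
```

A model would still have to be trained to do something useful after seeing the marker; the sketch only covers the sampling side.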

ithkuil1y ago

This may work when the next token is a key concept. But when it's a filler word, or part of one of many sequences of words that can convey the same meaning in different ways (synonyms, not only at the word level but also at the sentence level), it's harder to know whether the probability is low because the word is absolutely unlikely or because its likelihood is spread/shared among other truthful statements.

skydhash1y ago

You would need some kind of referential facts that you hold as true, then some introspection method to align sentences to those. If it can't be done, the output may be "I don't know". But even for programming languages (the simplest useful languages), it would be hard to do.

PaulHoule1y ago

My guess is the problem is words with high probabilities that happen to be part of a wrong answer.

For one thing, the probability of a word occurring is just the probability of the word occurring in a certain sample; it's not an indicator of truth (itself the most problematic concept in philosophy, in that just introducing it undermines the truth; see "9/11 truther"). It's also not sufficient to pick a "true" word, or to always pick a "true" word; rather, the truthfulness of a statement needs to be evaluated based on the statement as a whole.

A word might have a low probability because it competes with a large number of alternatives that are equally likely, which is not a reason to stop generation.

visarga1y ago

This reminds me it's easy to train similarity models, hard to train identity/equivalence prediction. Two strings can be similar in many ways, like "Address Line 1" and "Address Line 2" or "Position_X" and "Position_Y", yet distinct in meaning. That one character makes all the difference. On the other hand, "Vendor Name" is equivalent to "Seller Company" even though they are pretty different lexically.

The dot product, which is at the core of attention, is good for similarity, not identity. I think this is why models hallucinate - how can they tell the difference between "I have trained on this fact" and "Looks like something I trained on"?
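
A small sketch of the similarity-vs-identity gap, using a normalized dot product over character trigram counts. This is only a stand-in I picked for illustration (learned embeddings and attention are far richer), and the helper names are hypothetical. The point is just that a dot-product score can rank the "distinct in meaning" pair far above the "equivalent" pair:

```python
import math
from collections import Counter

def trigrams(text):
    """Bag of lowercase character trigrams."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))

def cosine(a, b):
    """Normalized dot product of the two trigram count vectors."""
    va, vb = trigrams(a), trigrams(b)
    dot = sum(va[g] * vb[g] for g in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0

# Lexically near-identical, but they mean different things.
near_duplicates = cosine("Address Line 1", "Address Line 2")
# Equivalent in meaning, but they share no trigrams at all.
equivalents = cosine("Vendor Name", "Seller Company")

print(near_duplicates, equivalents)
```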

atrus1y ago

I don't think that fixes it, even in theory, since there's always some uncertainty.

spencerchubb1y ago· 4 in thread

it's not "just" model error

during pre-training, there is never an incentive for the model to say "I don't know" because it would be penalized. the model is incentivized to make an educated guess
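
a rough sketch of the penalty, assuming the usual next-token cross-entropy objective and made-up probabilities (the two toy "models" below are hypothetical). if the corpus says the next token is "Paris", any weight moved onto the start of "I don't know" raises the loss:

```python
import math

def next_token_loss(probs, target):
    """Cross-entropy at one position: -log of the weight on the corpus token."""
    return -math.log(probs[target])

# Made-up next-token distributions after "The capital of France is".
# "I" stands for the first token of "I don't know".
guesser = {"Paris": 0.6, "Lyon": 0.3, "I": 0.1}
hedger = {"Paris": 0.2, "Lyon": 0.1, "I": 0.7}

# The corpus continues with "Paris", so the hedger pays a larger loss.
print(next_token_loss(guesser, "Paris"), next_token_loss(hedger, "Paris"))
```

the flip side is that if the training text itself continued with "I don't know", the hedger would be the one rewarded.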

large transformer models are really good at approximating their dataset. there is no data on the internet about what LLMs know. and even if there were such data, it would probably become obsolete soon

that being said, maybe a big shift in the architecture could solve this. I hope!

happypumpkin1y ago

> it would probably become obsolete soon

Suppose there are many times more posts about something one generation of LLMs can't do (arithmetic, tic-tac-toe, whatever), than posts about how the next generation of models can do that task successfully. I think this is probably the case.

While I doubt it will happen, it would be somewhat funny if training on that text caused a future model to claim it can't do something that it "should" be able to because it internalized that it was an LLM and "LLMs can't do X."

spencerchubb1y ago

also presumes that the LLM knows it is an LLM

singularity20011y ago

in another paper which popped up recently they approximated uncertainty with entropy and inserted "wait!" tokens whenever entropy was high, simulating chain of thought within the system.
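
a toy sketch of that idea, not the paper's actual method: the threshold, the greedy decoding, and the precomputed per-step distributions are all assumptions made up for illustration. in a real system the model would be re-run after the "wait!" token, so the later distributions would change; this only shows the entropy check itself:

```python
import math

def entropy_bits(probs):
    """Shannon entropy (in bits) of a next-token distribution."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

def decode_with_wait(step_distributions, threshold_bits=1.5, wait_token="wait!"):
    """Greedy decode; emit wait_token before any step whose entropy is high."""
    out = []
    for probs in step_distributions:
        if entropy_bits(probs) > threshold_bits:
            out.append(wait_token)
        out.append(max(probs, key=probs.get))
    return out

# Made-up distributions: one confident step, then a uniform guess over digits.
steps = [
    {"Paris": 0.97, "Lyon": 0.03},
    {str(d): 0.1 for d in range(10)},
]
print(decode_with_wait(steps))
```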

spywaregorilla1y ago

> during pre-training, there is never an incentive for the model to say "I don't know" because it would be penalized. the model is incentivized to make an educated guess

The guess can be "I don't know". The base LLM would generally only say "I don't know" if it "knew" that it didn't know, which is not going to be very common. The tuned LLM would be the level responsible for trying to equate a lack of understanding with saying "I don't know".

tucnak1y ago

I'm led to believe this is mostly because "known unknowns" are not well-represented in the training datasets... I think, instead of bothering with refusals and enforcing a particular "voice" with excessive RL, they ought to focus more on identifying "gaps" in the datasets and feeding them back; perhaps they're already doing this with synthetic data / distillation.
