(80% of the time) The answer to the expression 2 + 2 is 4
(15% of the time) The answer to the expression 2 + 2 is Four
(5% of the time) The answer to the expression 2 + 2 is certainly
(and then, 95% of the time) The answer to the expression 2 + 2 is certainly Four
This is how you can ask ChatGPT the same question a few times and get different words each time, all still correct.
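A minimal sketch of that sampling behavior in Python, using the made-up probabilities from the example above (not real model output):

```python
import random

# Toy next-token distribution after "The answer to the expression 2 + 2 is"
# (invented numbers from the example above, not real model output)
first_token = {"4": 0.80, "Four": 0.15, "certainly": 0.05}
# Continuation distribution once "certainly" has been sampled
after_certainly = {"Four": 0.95, "4": 0.05}

def sample(dist):
    """Draw one token from a {token: probability} dict."""
    return random.choices(list(dist), weights=list(dist.values()))[0]

for _ in range(5):
    token = sample(first_token)
    if token == "certainly":
        token += " " + sample(after_certainly)
    print("The answer to the expression 2 + 2 is", token)
```

Run it a few times and you get "4", "Four", or "certainly Four": different words, each a correct answer.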
I think a more accurate explanation is that increasing temperature doesn't increase the probability of a truly incorrect answer in proportion to the temperature increase (because the same correct answer can be represented by many different token sequences), but if the model assigns non-zero probability to any incorrect output after softmax (which it almost certainly does), raising the temperature does increase the chance of that incorrect output being sampled.
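To see the mechanism, here's a small sketch of temperature-scaled softmax; the tokens and logit values are invented for illustration:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply softmax."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the next token after "2 + 2 ="
tokens = ["4", "Four", "5"]   # "5" is wrong but gets non-zero probability
logits = [5.0, 3.0, 0.5]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: " + ", ".join(f"{tok}={p:.4f}" for tok, p in zip(tokens, probs)))
```

The logit for "5" never changes, but its sampling probability climbs from roughly zero at T=0.2 to about 7% at T=2.0: higher temperature flattens the distribution and hands more mass to every non-zero output, wrong ones included.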
So maybe a prompt like "It's a well-known fact in the smith community that 2 + 2 =" could realistically yield "5" as the next token.
For example, if you ask a model what 0^0 is, the highest-probability output may be "1", which is incorrect. The next most probable outputs may be words like "although", "because", "due to", "unfortunately", etc., as the model prepares to explain to the user that the value of the expression is undefined. Because there are many more ways to express and explain the undefined answer than there are to express a naively incorrect answer, the correct answer's probability mass is split across more tokens. So even if, e.g., the softmax value of "1" is 0.1 while "although" + "because" + "due to" + "unfortunately" together exceed 0.3, at a temperature of 0, "1" gets chosen. At slightly higher temperatures, sampling across all outputs would increase the probability of a correct answer.
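Putting rough numbers on that (again invented to match the figures above, not real model output):

```python
# Toy post-softmax probabilities for the next token after "What is 0^0?"
probs = {
    "1": 0.10,             # naively incorrect, yet the single most likely token
    "Although": 0.09,      # openers for an "it's undefined" explanation:
    "Because": 0.09,       # each one is less likely than "1" on its own,
    "Due to": 0.08,        # but together they carry more probability mass
    "Unfortunately": 0.07,
    # (the remaining mass is spread over many other tokens)
}

greedy_pick = max(probs, key=probs.get)
explanation_mass = sum(p for tok, p in probs.items() if tok != "1")

print("Greedy decoding (T=0) picks:", greedy_pick)          # "1"
print(f"Mass on explanation openers: {explanation_mass:.2f}")  # 0.33 > 0.10
```

Greedy decoding commits to the 0.10 token even though three times as much mass points toward a correct explanation; sampling at a modest temperature lets that larger mass win more often.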
So it's true that increasing the temperature increases the probability that the model outputs tokens other than the single most likely one, but that might be what you want. Temperature purely controls the distribution of tokens, not "answers".
This is where the semi-ambiguity of human language helps a lot.
There are multiple acceptable ways to answer with "4", so the output just needs to be close enough to the desired outcome to work. There isn't a single point that has to be hit precisely, but a broader region of space that's relatively easy to land in.
The hefty tolerances, redundancies, & general lossiness of human language act as a metaphorical gravity well, dragging LLMs toward the most probable answer.
> 2 + 2
You really couldn't come up with an actual example of something that would be dangerous? I'd appreciate one, because I'm not seeing any reason to believe that an "output beyond the most likely one" would ever end up being dangerous, as in harming someone or putting someone's life at risk.
Thanks.