
0 points · astrange · 3y ago · 0 comments

A transformer is a universal approximator and there is no reason to believe it's not doing actual calculation. GPT-3.5+ can't do math that well, but it's not "just generating text", because its math errors aren't just regurgitating existing problems found in its training text.

It also isn't generating "the most likely response" - that's what original GPT-3 did, GPT-3.5 and up don't work that way. (They generate "the most likely response" /according to themselves/, but that's a tautology.)


mach1ne · 3y ago

> It also isn't generating "the most likely response" - that's what original GPT-3 did, GPT-3.5 and up don't work that way.

What changed?

astrange (OP) · 3y ago

It answers questions in a voice that isn't yours.

The "most likely response" to text you wrote is: more text you wrote. Anytime the model provides an output you yourself wouldn't write, it isn't "the most likely response".

afiori · 3y ago

I believe that ChatGPT works by inserting some ANSWER_TOKEN. That is, a bare prompt like "Tell me about cats" would probably produce "Tell me about cats because I like them a lot", but the interface wraps your prompt as "QUESTION_TOKEN: Tell me about cats ANSWER_TOKEN:"
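
To make the wrapping idea concrete, here is a minimal sketch in Python. The marker strings and the helper names `wrap_prompt` and `extract_answer` are made up for illustration; nothing here is the real ChatGPT format, which this thread does not specify.

```python
# Hypothetical marker strings, chosen only for illustration.
# The real chat format is not given anywhere in this thread.
QUESTION_TOKEN = "QUESTION_TOKEN:"
ANSWER_TOKEN = "ANSWER_TOKEN:"


def wrap_prompt(user_text: str) -> str:
    """Wrap raw user text so a plain completion model is nudged to continue with an answer."""
    return f"{QUESTION_TOKEN} {user_text.strip()} {ANSWER_TOKEN}"


def extract_answer(completion: str) -> str:
    """Keep only the text produced before the model starts a new question turn."""
    return completion.split(QUESTION_TOKEN, 1)[0].strip()


print(wrap_prompt("Tell me about cats"))
```

The point is only that the model still continues text; the wrapper changes what text it is continuing, so the likely continuation becomes an answer rather than more of the question.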


meow_mix · 3y ago

Reinforcement learning w/ human feedback. What u guys are describing is the alignment problem

mistymountains · 3y ago

That’s just a supervised fine tuning method to skew outputs favorably. I’m working with it on biologics modeling using laboratory feedback, actually. The underlying inference structure is not changed.

ainiriand · 3y ago

I wonder if that is why v3.5 failed every time I asked it to generate a number with 255, while v4 does it correctly. By the way, do not even try with Bing.
