So in that sense you're right that there is no way it can "build a character" from continued conversations. If there is anything resembling "personality", it must have emerged during training or fine-tuning.
However, I do think it's possible that some kind of "world model" which includes reasoning about "itself" has emerged during training - and that world model influences how responses are generated.
As for how that squares with the token-by-token generation, keep in mind that the model gets the entire previous conversation as input (or at least the last n tokens, with n=4096 for ChatGPT I think). So you could imagine this as trying to continue a hypothetical conversation: "If user had said this and then I had said that and then user had said that other thing, what would I say next?"
Or later: "If user had said this, etc etc, and I had started my answer with 'well, actually', which word would I say next?"
This process is repeated token by token - for each token, the bot can consult the conversation so far and all the information stored in the model to find a continuation. And we currently don't really know what kind of information is actually stored in the model.
And even then, it's not even restricted to only reason about the next word, it's just restricted to only output the next word.
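The token-by-token loop described above can be sketched roughly like this. Everything here is a stand-in (the `model` function, the toy vocabulary, the greedy decoding); the point is only the shape of the loop: the full visible conversation goes in, exactly one token comes out, and that token is appended and fed back in.

```python
CONTEXT_WINDOW = 4096  # hypothetical window size, as mentioned above

def generate(model, conversation_tokens, max_new_tokens):
    tokens = list(conversation_tokens)
    for _ in range(max_new_tokens):
        # The model only ever sees the last n tokens of the conversation.
        visible = tokens[-CONTEXT_WINDOW:]
        # One forward pass: arbitrary internal computation, one token out.
        probs = model(visible)
        next_token = max(probs, key=probs.get)  # greedy decoding for simplicity
        tokens.append(next_token)
        if next_token == "<eos>":
            break
    return tokens

# Tiny stand-in "model" just to exercise the loop:
def toy_model(visible):
    return {"there": 0.9, "<eos>": 0.1} if visible[-1] == "hi" else {"<eos>": 1.0}

result = generate(toy_model, ["hi"], 5)  # -> ["hi", "there", "<eos>"]
```

Note that nothing restricts what `model` computes internally per pass; the restriction is only that a single token leaves the loop each iteration.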
E.g., there was another thread about how the model chooses between emitting "a" and "an" in a way that matches the context: if you ask ChatGPT what yellow, bendy fruit is in your basket, it might output "a" followed by "banana". How can it know that it has to output "a" when it hasn't yet emitted the token "banana"? One possible answer could be that it already predicted "banana" internally but didn't output it yet. (And then in the next iteration, it will repeat the calculations that made it arrive at "banana", this time actually outputting the word.)
That last part is speculation from me though.
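One way to picture that speculation: if the network internally assigns weight to whole continuations, then the score for "a" vs "an" falls out as a marginal over those continuations. This toy makes that concrete with a made-up distribution (it is an analogy, not the actual mechanism or real model probabilities):

```python
# Hypothetical internal "beliefs" about the full answer:
continuations = {
    ("a", "banana"): 0.85,
    ("an", "apple"): 0.10,
    ("an", "orange"): 0.05,
}

def first_token_marginal(continuations):
    # Probability of each possible first token, summed over continuations.
    marginal = {}
    for (first, _rest), p in continuations.items():
        marginal[first] = marginal.get(first, 0.0) + p
    return marginal

probs = first_token_marginal(continuations)
# "a" wins because "banana" dominates the internal continuation distribution,
# even though "banana" itself hasn't been emitted yet.
```

Under this picture the model never needs to "know the future" - the preference for "a" is just the shadow cast by continuations it is already weighing internally.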
There are definitely limits though. I tried to have ChatGPT generate a program but output it reversed and so far the answers were just gibberish.
Further, the amount of actual planning (or thinking) it can do in a single forward pass is quite limited compared to what can be done over the course of a long output - that's why tricks like "let's think step-by-step" are so powerful. If it could plan out the entire response in one forward pass, it could equally output the answer directly. But the depth of the network limits multi-step reasoning. To have a persistent long-term plan of a "sophisticated manipulator" (as the article calls it) seems clearly impossible.
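A crude analogy for why step-by-step helps: imagine a "network" whose forward pass can apply only a fixed number of operations. Asking for the final answer in one pass fails on long problems, but writing intermediate results back into the context lets the same shallow pass chain itself. The depth limit and the addition task here are both invented for illustration:

```python
MAX_DEPTH = 3  # hypothetical per-pass "depth" limit

def forward_pass(partial_sum, remaining):
    # One pass: consume at most MAX_DEPTH numbers.
    for x in remaining[:MAX_DEPTH]:
        partial_sum += x
    return partial_sum, remaining[MAX_DEPTH:]

def solve_in_one_pass(numbers):
    # "Answer directly": a single forward pass, no intermediate output.
    total, rest = forward_pass(0, numbers)
    return total if not rest else None  # ran out of depth

def solve_step_by_step(numbers):
    # Each emitted intermediate sum is fed back in as input to the next pass.
    total, rest = 0, numbers
    steps = []
    while rest:
        total, rest = forward_pass(total, rest)
        steps.append(total)  # intermediate result lands in the "context"
    return total, steps
```

The per-pass compute is identical in both cases; the step-by-step version wins only because the output stream acts as external scratch memory.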