I showed them counterexamples.
"COCONUT, PCCoT, PLaT and co are directly linked to 'thinking in latent space'. yann lecun is working on this too, we have JEPA now."
"In context" is the obvious answer... but if you view the chain of thought from a reasoning model, it may have little or nothing to do with arriving at the correct answer. It may even be complete nonsense. The model is working with tokens in context, but internally the transformer is maintaining some state with those tokens that seems to be independent of the superficial meanings of the tokens. That is profoundly weird, and to me, it makes it difficult to draw a line in the sand between what LLMs can do and what human brains can do.