You know, I think some people (I see this on Twitter, probably not you) have the wrong intuition about artificial intelligence. They see models that are fundamentally stochastic as incapable of ever being truly intelligent. It's "just statistics" or a "stochastic parrot" that learns probabilities instead of real meaning. Perhaps they think that since there is always randomness involved, you cannot have the kind of deterministic thought process we feel we have. The worst offender, by this view, is the old-school Markov chain.
I obviously think this is wrong, which is why I like to emphasize that transformers are best interpreted as Markov chains on a much larger state space, and that this framing does actually explain their computational behavior. The key point is that the state is not the last token but the entire context window: the next-token distribution is a function of the current window alone, and sampling a token just slides the window to a new state.
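Here is a minimal sketch of that framing. The next-token "model" below is a made-up pure function (a seeded RNG standing in for a transformer's forward pass), the vocabulary is three toy integer tokens, and the window size K is hypothetical; what matters is the interface: the distribution over the next token depends only on the current window of K tokens, so sliding the window is exactly one step of a Markov chain on the state space of all |V|^K windows.

```python
import random

VOCAB = [0, 1, 2]   # toy integer tokens (hypothetical, for illustration)
K = 2               # context window size; the chain has |V|**K = 9 states

def encode(state):
    """Map a window of K tokens to a single integer (its base-|V| code)."""
    code = 0
    for tok in state:
        code = code * len(VOCAB) + tok
    return code

def next_token_probs(state):
    """Toy stand-in for a model's next-token distribution.

    It is a pure function of the current window `state` alone -- that is
    the Markov property. A transformer computes the same kind of map
    P(next token | last K tokens), just via attention rather than a
    seeded RNG; the probabilities here are made up.
    """
    rng = random.Random(encode(state))          # fixed per state
    weights = [rng.random() for _ in VOCAB]
    total = sum(weights)
    return [w / total for w in weights]

def step(state, rng):
    """One transition of the Markov chain on the enlarged state space:
    sample a token from the state's distribution, then slide the window.
    The new state depends only on the current state and fresh randomness."""
    probs = next_token_probs(state)
    tok = rng.choices(VOCAB, weights=probs)[0]
    return state[1:] + (tok,)

rng = random.Random(42)
state = (0, 1)
trajectory = [state]
for _ in range(5):
    state = step(state, rng)
    trajectory.append(state)
print(trajectory)
```

Note that nothing about the randomness prevents structure: the transition kernel itself is completely deterministic, and for a real transformer the "larger state space" is astronomically big (vocabulary size to the power of the context length), which is where the interesting computation lives.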