story
Lots of things are Turing complete. We don't usually think they're smart, unless it's the first time we've seen a computer and we have no idea how it works.
Mathematically, an LLM is a Markov chain. We can build an "LLM" with a context window of one token and it's basically a token frequency table. We can make the context window bigger and it becomes better at generating plausible-looking text.
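A minimal sketch of that idea, assuming a toy corpus and whitespace tokenization: the whole "model" is a frequency table keyed by the previous n tokens, and the "context window" is just n.

```python
import random
from collections import defaultdict, Counter

def build_table(tokens, n=1):
    """Frequency table: last n tokens -> counts of the next token."""
    table = defaultdict(Counter)
    for i in range(len(tokens) - n):
        context = tuple(tokens[i:i + n])
        table[context][tokens[i + n]] += 1
    return table

def generate(table, seed, length=20):
    """Sample one token at a time, conditioned only on the last n tokens."""
    out = list(seed)
    n = len(seed)
    for _ in range(length):
        counts = table.get(tuple(out[-n:]))
        if not counts:
            break
        tokens, weights = zip(*counts.items())
        out.append(random.choices(tokens, weights=weights)[0])
    return " ".join(out)

# Toy example: n=1 is literally a per-token frequency table.
corpus = "the cat sat on the mat and the cat ran".split()
table = build_table(corpus, n=1)
print(generate(table, seed=("the",)))
```

Bumping n up makes the continuations look more fluent, but nothing about the mechanism changes; the table just gets bigger.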
Is it possible that, beyond getting better at generating plausible-looking text (the expected and observed outcome), it also gains some actual intelligence? It's very hard to disprove, but Occam's razor might not be kind to it.