When asked to prove it, it spelled out the letters one by one and still failed (ChatGPT insisted the answer was still 2; Claude “corrected” itself to 1). Only when I forced it to place a count beside each letter did it get the answer right.
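For reference, the "count beside each letter" procedure is purely mechanical. Here is a minimal sketch in Python; the function name `tally_letter` and the example word and letter are placeholders I made up for illustration, not the actual question I asked.

```python
def tally_letter(word: str, target: str) -> list[tuple[str, int]]:
    """Pair each letter of `word` with the running count of `target` seen so far."""
    count = 0
    tally = []
    for ch in word:
        # Case-insensitive comparison, so "A" and "a" count as the same letter.
        if ch.lower() == target.lower():
            count += 1
        tally.append((ch, count))
    return tally


# Hypothetical word and letter, chosen only for illustration.
for ch, running in tally_letter("banana", "a"):
    print(f"{ch}: {running}")
```

The last number printed is the answer. Writing the running count beside each letter is exactly the decomposition I had to force out of the model.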
It’s not really about the specific question; that just highlights that it does not have the ability to comprehend and reason. It’s a prediction machine.
If it cannot decompose such a simple problem, how can it possibly get complex programming problems right when they cannot simply be pattern-matched to a solution? My experience with ChatGPT, Claude, and Copilot writing code bears this out. It often generates code that looks correct on the surface but, when tested, either fails outright or fails subtly.
Even things like CSS it gets wrong, producing output that on the surface seems to do what you asked but in fact doesn’t actually style it correctly at all.
Its lack of ability to understand, decompose, and reason is the problem. The fact that it’s so confident even when wrong is the problem. The fact that it cannot detect when it doesn’t know is the problem.
It generates text that has a high probability of “looking” correct, not text that has a high probability of being correct. With simple questions like the one I posed, it’s obvious to us when it gets it wrong. With complex programming tasks, the solution is complex enough that it often takes significant effort to determine whether it’s correct or wrong. There’s more room for it to “look” correct without “being” correct.
> But if you've never tried GitHub Copilot
I used it for almost a year before I cancelled my subscription because it wasn’t adding much value. I found Copilot Chat a bit more useful, but ChatGPT was good enough for that. I still use ChatGPT when programming: as a tool to help with documentation (“what’s the React function to do X?” type questions), to rubber duck, to ask for pros and cons lists on ideas or approaches, and to get starting points. But never to write the code for me, at least not without expecting to rewrite it significantly, unless it’s super trivial (but then I would likely have written it faster myself anyway).
Basically, I use it like this person does: https://news.ycombinator.com/item?id=41350207