Way weirder than this is that LLMs are frequently correct on this task.
And if you forgo the counting and just ask the model to list the letters, it is almost always correct, even though, once again, it never sees the input characters.
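To make "never sees the input characters" concrete, here is a minimal sketch using a toy greedy longest-match tokenizer. It is not real BPE, and the vocabulary, the token IDs, and the example word "strawberry" are all made up for illustration; real tokenizers learn far larger vocabularies and may split the word differently.

```python
import string

# Hypothetical toy vocabulary: one ID per lowercase letter, plus two
# multi-letter pieces. Real vocabularies are learned and much larger.
TOY_VOCAB = {ch: i for i, ch in enumerate(string.ascii_lowercase)}
TOY_VOCAB.update({"straw": 100, "berry": 101})


def tokenize(text, vocab):
    """Greedy longest-match tokenization (a stand-in for BPE, not BPE itself)."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


word = "strawberry"
token_ids = tokenize(word, TOY_VOCAB)

print(token_ids)        # what the model receives: two opaque integer IDs
print(word.count("r"))  # what we are asking about: a character-level fact
```

In this toy setup the model's input is just two integers, and nothing in those integers spells out which letters the pieces contain. Any letter-level knowledge has to come from what the model learned about each token during training.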
Much has been written about how tokenization hurts tasks that the LLM providers literally market their models on (Anthropic Haiku, Sonnet): https://aclanthology.org/2022.cai-1.2/