0 points · andrewla · 1y ago · 0 comments

Way weirder than this is that LLMs are frequently correct in this task.

And if you forgo the counting and just ask the model to list the letters, it is almost always correct, even though, once again, it never sees the input characters.
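To make "never sees the input characters" concrete, here is a minimal sketch of the idea. The vocabulary, the IDs, and the `toy_tokenize` helper are all made up for illustration and are not any real model's tokenizer; real tokenizers use learned merges, but the effect is similar: the model gets a short list of integers, not letters.

```python
# Toy vocabulary: hypothetical subword pieces and IDs, not a real model's tokenizer.
TOY_VOCAB = {
    "str": 101, "aw": 102, "berry": 103,
    "s": 1, "t": 2, "r": 3, "a": 4, "w": 5, "b": 6, "e": 7, "y": 8,
}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match split of text into vocabulary IDs."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest candidate piece starting at position i first.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary piece matches at position {i}")
    return ids


word = "strawberry"
token_ids = toy_tokenize(word)
print(token_ids)        # the integer IDs a model would receive
print(list(word))       # the character-level view it does not receive
print(word.count("r"))  # trivial on characters, not readable off the IDs
```

Nothing in the ID list says how many times "r" occurs; any spelling knowledge has to come from what the model learned about each token during training.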

Der_Einzige · 1y ago

This is the correct take.

Much has been written about how tokenization hurts tasks that the LLM providers literally market their models on (Anthropic's Haiku and Sonnet): https://aclanthology.org/2022.cai-1.2/
