undefined | Better HN

0 pointspunee947mo ago0 comments

I ran the below prompt to both Kimi2 and GPT5.

how many rs in cranberry?

-- GPT5's response: The word cranberry has two “r”s. One in cran and one in berry.

Kimi2's response: There are three letter rs in the word "cranberry".

0 comments

einarfd7mo ago

I got the same when trying it with standard gpt5. But when I used the thinking mode I got:

3 — cranberry.

Tried with Claude sonnet 4 as well:

There are 3 r’s in the word “cranberry”:

c-*r*-a-n-b-e-*rr*-y

The r’s appear in positions 2, 7, and 8.

I would expect standard gpt5 to get it right tbh.

samsullivan7mo ago

answering correctly is completely dependent on the attention blocks to somehow capture the single letter nuance given word tokenization constraints. does the attention block in kimi have a more receptive architecture to this?

mustaphah7mo ago

Stop asking LLMs to count!

Text is broken into tokens in training (subword/multi-word chunks) rather than individual characters; the model doesn’t truly "see" letters or spaces the way humans do. Counting requires exact, step-by-step tracking, but LLMs work probabilistically.

It's not much of a help anyway, don't you agree?

phyzome7mo ago

Why stop? It's hilarious to watch AI floggers wriggle around trying to explain why AGI is just around the corner but their text-outputting machines can't read text.

camel-cdr7mo ago

How many rs are in a sentence spoken out loud to you?

Surely we can't figure it out, because sentences are broken up into syllables when spoken; you don't truly hear individual characters, you hear syllables.

mrkeen7mo ago

Plenty of opportunity to tokenise and re-tokenise here: https://mastodon.social/@kjhealy/114990301650917094

jtwoodhouse7mo ago

What does it say about us that we think this is AGI or close to it?

Maybe AGI really is here?

robryan7mo ago

Seems like a good benchmark for AGI. Start with things that are easy for humans but hard for LLMs currently.

mustaphah7mo ago

But they have access to tools (though I'm not sure why they're not using them in this case).

Ask it to count using a coding tool, and it will always give you the right answer. Just as humans use tools to overcome their limits, LLMs should do the same.

FergusArgyll7mo ago

How does reasoning help then?

mustaphah7mo ago

IDK. Probably the model's doing some mental gymnastics to figure that out. I was surprised they haven't taught it to count yet. It's a well-known limitation.

1 more reply

j / k navigate · click thread line to collapse

0 comments

einarfd7mo ago

I got the same when trying it with standard gpt5. But when I used the thinking mode I got:

3 — cranberry.

Tried with Claude sonnet 4 as well:

There are 3 r’s in the word “cranberry”:

c-*r*-a-n-b-e-*rr*-y

The r’s appear in positions 2, 7, and 8.

I would expect standard gpt5 to get it right tbh.

samsullivan7mo ago

mustaphah7mo ago

Stop asking LLMs to count!

It's not much of a help anyway, don't you agree?

phyzome7mo ago

Why stop? It's hilarious to watch AI floggers wriggle around trying to explain why AGI is just around the corner but their text-outputting machines can't read text.

camel-cdr7mo ago

How many rs are in a sentence spoken out loud to you?

Surely we can't figure it out, because sentences are broken up into syllables when spoken; you don't truly hear individual characters, you hear syllables.

mrkeen7mo ago

Plenty of opportunity to tokenise and re-tokenise here: https://mastodon.social/@kjhealy/114990301650917094

jtwoodhouse7mo ago

What does it say about us that we think this is AGI or close to it?

Maybe AGI really is here?

robryan7mo ago

Seems like a good benchmark for AGI. Start with things that are easy for humans but hard for LLMs currently.

mustaphah7mo ago

But they have access to tools (though I'm not sure why they're not using them in this case).

Ask it to count using a coding tool, and it will always give you the right answer. Just as humans use tools to overcome their limits, LLMs should do the same.

FergusArgyll7mo ago

How does reasoning help then?

mustaphah7mo ago

IDK. Probably the model's doing some mental gymnastics to figure that out. I was surprised they haven't taught it to count yet. It's a well-known limitation.

1 more reply

j / k navigate · click thread line to collapse