What ChatGPT actually sees when you input that question is the output of the tokenizer:
[5299, 1991, 18151, 553, 306, 290, 2195, 392, 491, 33465, 69029]
This happens to be 11 tokens, but I think that's a coincidence. Token 491 is "int" and token 33465 is "elligence", but ChatGPT doesn't actually see the letters inside those tokens — only the integer IDs.
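To make this concrete, here's a toy greedy longest-match tokenizer. The vocabulary is hypothetical (real tokenizers like ChatGPT's use byte-pair encoding with ~200k entries), but it shows the key point: the model receives a list of integers, and the letters are gone by the time it sees anything.

```python
# Hypothetical vocabulary for illustration only -- not the real one,
# though 491 and 33465 match the two token IDs discussed above.
VOCAB = {"int": 491, "elligence": 33465}

def tokenize(text, vocab):
    """Greedy longest-match segmentation into vocabulary pieces."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest possible piece starting at position i first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                ids.append(vocab[text[i:j]])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary piece matches at position {i}")
    return ids

print(tokenize("intelligence", VOCAB))  # [491, 33465]
```

Nothing in `[491, 33465]` encodes that "intelligence" contains twelve letters, or any letters at all; the model would have to have memorized the spelling of each token during training to count them.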
How can you expect it to count letters, given those limitations? It can only estimate how many letters each token represents, and its estimate was close but not exact.
This is an artificial example, almost maximally designed to make ChatGPT fail.