
0 points · 1270018080 · 2y ago · 0 comments

You answered your own question.



15457345234 · 2y ago · 2 in thread

Can you elaborate?

Your question:

  > How does chat-gpt actually get this right?

Your answer:

  > its output is purely probabilistic, based on existing corpus of text

Because GPT was trained on existing text, some of which included numbers and counting, it's learnt the natural ordering of most common/everyday numbers.

For larger or more complex numbers, it's learnt the patterns behind how they're constructed linguistically, which allows it to output a written count in sequence.
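To make "patterns behind how they're constructed linguistically" concrete, here is a minimal sketch of how few rules are needed to spell out a number in English. The function name `to_words` and the range limit are my own choices for illustration; nothing here is a claim about what GPT does internally, only that the written form is regular enough to be captured by a handful of composable patterns.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def to_words(n: int) -> str:
    """Spell out 0 <= n < 1_000_000 (illustrative helper, not from the thread)."""
    if not 0 <= n < 1_000_000:
        raise ValueError("this sketch only covers 0..999999")
    if n < 20:
        # Irregular forms that simply have to be memorised.
        return ONES[n]
    if n < 100:
        tens, rest = divmod(n, 10)
        return TENS[tens] + (" " + ONES[rest] if rest else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + to_words(rest) if rest else "")
    # Thousands reuse the same rules on each side of the word "thousand".
    thousands, rest = divmod(n, 1000)
    return to_words(thousands) + " thousand" + (" " + to_words(rest) if rest else "")


print(to_words(47600))  # forty seven thousand six hundred
```

Apart from the twenty irregular words at the bottom, everything is recursion over the same small vocabulary, which is the kind of regularity a model can pick up from text.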

This same pattern recognition doesn't work anywhere near as well for actual numerals (e.g. "47600", instead of "forty seven thousand six hundred"), as the tokenizer tends to break long numerals apart (e.g. into ["476", "00"]).
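A toy sketch of the splitting effect, assuming a greedy longest-match tokenizer over a tiny made-up vocabulary. `TOY_VOCAB` and `tokenize` are hypothetical; real GPT tokenizers use byte-pair encoding with learned merge rules and a far larger vocabulary, so the exact pieces will differ, but the outcome is similar in kind: a long numeral comes out as chunks that don't line up with its place values.

```python
# Hypothetical vocabulary: a few digit chunks the "tokenizer" happens to know.
TOY_VOCAB = {"476", "47", "00", "0", "4", "6", "7"}


def tokenize(text: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match split; a simplification, not real BPE."""
    max_len = max(len(piece) for piece in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                tokens.append(piece)
                i += size
                break
        else:
            # Unknown character: emit it as its own token.
            tokens.append(text[i])
            i += 1
    return tokens


print(tokenize("47600", TOY_VOCAB))  # ['476', '00']
```

The model sees "476" and "00" as two unrelated symbols, so nothing in the input tells it that this is one more than 47599 the way "forty seven thousand six hundred" follows from its neighbours in words.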

pbhjpbhj · 2y ago

In text, we don't often count in series, and it seems likely that we often choose a non-counting sequence: like 'I chose options 1, 2, 7' or 'my code was 0 1 2 5', whatever.

Unless training included line-level skips, rather than just next-word skips (like word2vec) or concept-level associations? At the line level, or paragraph level, ordered numerical sequences are obviously very common in formal texts or in code.

I've seen sentence-based training; I suppose that for code (which it seems GPT-4 excels at), line-level training would be essential.

Can anyone recommend a mid-level read on this, covering different modes of training and such? I'm happy with a bit of code and undergrad-level maths. Thanks.

