
0 points · jjtheblunt · 2y ago · 0 comments

Interesting. Thanks for the comment. What if you ask it to add numbers with more digits?



The model is trained with a fixed number of tokens. I don't remember whether the models I trained use sinusoidal or learnable positional embeddings. With learnable embeddings, there would be no embedding to encode positions beyond the training length. With sinusoidal embeddings, I think longer inputs would still cause problems in the embedding layer, since the sine and cosine would wrap around.
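To make the difference concrete, here is a rough sketch in plain Python. It is not the code of the models discussed above: `MAX_LEN`, `D_MODEL`, `learned_table`, and both function names are made up for illustration, and the sinusoidal formula is the common sin/cos formulation, which may differ from what was actually trained.

```python
import math

MAX_LEN = 8   # hypothetical training context length (assumption)
D_MODEL = 4   # hypothetical embedding width (assumption)

def sinusoidal_embedding(pos, d_model=D_MODEL):
    """Common sinusoidal encoding: sin/cos pairs at decreasing frequencies."""
    emb = []
    for i in range(0, d_model, 2):
        angle = pos / (10000 ** (i / d_model))
        emb.append(math.sin(angle))
        emb.append(math.cos(angle))
    return emb[:d_model]

# Learnable table: one row per position seen in training.
# Zeros stand in for trained weights.
learned_table = [[0.0] * D_MODEL for _ in range(MAX_LEN)]

def learned_embedding(pos):
    """Plain table lookup; there is simply no row for pos >= MAX_LEN."""
    return learned_table[pos]

# The sinusoidal version is defined for any position, including unseen ones.
print(sinusoidal_embedding(MAX_LEN))

# The learned version fails outright once the input is longer than training.
try:
    learned_embedding(MAX_LEN)
except IndexError:
    print("no learned embedding for position", MAX_LEN)
```

The sinusoidal function still returns a vector for an unseen position, but every component is periodic, so the model ends up seeing value combinations it never encountered during training. That is one way to read the wrap-around concern.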
