riku_iki
2y ago
Absolutely not. Transformer layers already communicate using embeddings, and ASCII would be far less efficient there.
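For concreteness, a minimal sketch of what "layers communicate using embeddings" means, with toy dimensions and the layer internals replaced by a stand-in (GPT-3's actual hidden width is 12,288):

```python
# Minimal sketch: transformer layers pass continuous embedding
# vectors between them, never text. Dimensions are toy-sized here;
# GPT-3 (175B) uses d_model = 12,288.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 8, 64

# Stand-in for a layer's weights; a real layer has attention + MLP.
W = rng.normal(scale=0.01, size=(d_model, d_model))

def transformer_layer(h: np.ndarray) -> np.ndarray:
    # The point is only that input and output are both
    # (seq_len, d_model) float arrays, i.e. embeddings.
    return h + np.tanh(h @ W)

h = rng.normal(size=(seq_len, d_model))  # embedded input tokens
for _ in range(4):                        # a stack of layers
    h = transformer_layer(h)
print(h.shape, h.dtype)                   # (8, 64) float64 -- still embeddings
```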
Rhapso
2y ago
And how many bits are in an embedding vector?
riku_iki
OP
2y ago
12k for GPT-3.
riku_iki
OP
2y ago
Those are not bits, but weights.
Rhapso
2y ago
So somehow ASCII is less information-dense than 12k 32-bit floats per token?
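A back-of-the-envelope comparison of the raw sizes in question, assuming GPT-3's 12,288-dimensional embeddings stored as fp32 and the common rule of thumb of roughly 4 ASCII characters per token:

```python
# Raw bits per token: fp32 embedding vs. ASCII text (rough sketch).

EMBED_DIM = 12_288       # GPT-3 (175B) embedding width
BITS_PER_FLOAT = 32      # fp32 storage

embedding_bits = EMBED_DIM * BITS_PER_FLOAT
print(f"embedding per token: {embedding_bits:,} bits")  # 393,216 bits

CHARS_PER_TOKEN = 4      # rough English average (an assumption)
BITS_PER_CHAR = 8        # one ASCII byte per character

ascii_bits = CHARS_PER_TOKEN * BITS_PER_CHAR
print(f"ASCII per token:     {ascii_bits} bits")        # 32 bits

print(f"ratio: ~{embedding_bits // ascii_bits:,}x")     # ~12,288x
```

By this raw count, a single fp32 embedding occupies roughly 12,000× more bits than the ASCII text of a typical token, which is the gap the question is pointing at.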