Well, the weights are accumulated in full precision and multiplied by a full-precision scale factor after quantization, and the activations and the backward pass are computed in full precision too, so it isn't quite true 4-bit training. The resulting model can still be stored at only slightly more than 4 bits per parameter, though.
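To make that concrete, here's a minimal numpy sketch of the quantize-dequantize ("fake quantization") step this kind of training uses, assuming symmetric per-tensor scaling; the function and variable names are illustrative, not from any particular library. The master weights stay full precision, a full-precision scale factor is derived each step, and only the rounded 4-bit copy (times the scale) is used in the forward pass:

```python
import numpy as np

def fake_quantize(w, bits=4):
    """Round w onto a signed 4-bit integer grid, then multiply back by a
    full-precision scale. The caller keeps the original full-precision w
    as the master copy; this returns the dequantized values and the scale."""
    qmax = 2 ** (bits - 1) - 1               # 7 for signed 4-bit
    scale = np.abs(w).max() / qmax           # full-precision per-tensor scale
    if scale == 0:
        return w.copy(), 1.0
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)  # 16 integer levels
    return q * scale, scale                  # values used in the forward pass

# Training-step sketch: the forward pass sees the quantized copy, but the
# gradient update is applied to the full-precision master weights
# (straight-through estimator), and the quantized copy is recomputed.
rng = np.random.default_rng(0)
w_master = rng.normal(size=8).astype(np.float32)  # full-precision accumulator
w_q, scale = fake_quantize(w_master)
grad = rng.normal(size=8).astype(np.float32)
w_master -= 0.01 * grad                           # update in full precision
```

At deployment you would store only the integer grid values plus the per-tensor (or per-group) scale, which is why the stored size works out to slightly more than 4 bits per parameter.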