I've been told that 4-bit quantization slows it down, but don't quote me on that, since I wasn't able to benchmark at 8-bit locally.
In any case, you're right that it might not be as significant. However, output quality does improve at 8/16-bit, and running 65B on 24GB is completely impossible.
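
For a rough sense of the 65B point, here's a back-of-the-envelope sketch. It counts only the weights (parameters × bits / 8) and assumes 1 GB = 1e9 bytes; activations, the KV cache, and quantization metadata would all add to these numbers, so treat it as a lower bound rather than a real measurement. The `weight_gb` helper is just something I made up for illustration.

```python
# Rough weight-only memory estimate; ignores activations, KV cache,
# and quantization metadata (all of which add more on top).

def weight_gb(params_billion: float, bits: int) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits / 8

VRAM_GB = 24  # the 24GB card mentioned above

for bits in (4, 8, 16):
    size = weight_gb(65, bits)
    verdict = "fits" if size <= VRAM_GB else "does not fit"
    print(f"65B @ {bits}-bit: ~{size:.1f} GB -> {verdict} in {VRAM_GB} GB")
```

Even at 4-bit the weights alone come out above 24 GB by this estimate, which is why 65B on a single 24GB card is off the table without offloading.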