Well, those smaller floats also require less bandwidth to transfer back and forth. The reduction may not be linear in the size of the float, though, since smaller floats might need more iterations and/or more nodes in the model graph to get an equivalent result.
But rest assured there's an improvement. People wouldn't be doing it if there weren't any benefit!
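
To put rough numbers on the bandwidth point, here's a back-of-the-envelope sketch. The tensor size and the `work_factor` knob (a stand-in for "more iterations and/or more nodes") are made up for illustration, not measurements from any real model, and the function counts raw payload only.

```python
def transfer_bytes(n_values, bits_per_value, work_factor=1.0):
    """Payload bytes needed to move n_values numbers of the given width.

    work_factor is a hypothetical multiplier for any extra iterations or
    graph nodes a narrower format might need to reach an equivalent result.
    Protocol/framing overhead is ignored.
    """
    return n_values * bits_per_value / 8 * work_factor


N = 1_000_000  # hypothetical number of values, chosen for illustration

for name, bits in [("32-bit", 32), ("16-bit", 16), ("8-bit", 8)]:
    print(f"{name}: {transfer_bytes(N, bits) / 1e6:.1f} MB")

# If 16-bit needed, say, 1.5x the work (an assumed figure), the saving
# shrinks from 2x to about 1.33x, but it is still a saving.
print(f"16-bit with 1.5x work: {transfer_bytes(N, 16, work_factor=1.5) / 1e6:.1f} MB")
```

So halving the width halves the traffic in the ideal case, and the gain only disappears if the extra work fully cancels the width reduction (here, a `work_factor` of 2 or more for 16-bit vs 32-bit).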