I've used this in the past, in high-performance math code.
If you have data (vectors, matrices, etc.) that doesn't fit neatly into a SIMD block size, you'll have to zero out the unused fields after the calculation. At this point, it's cheaper to generate a zero in a register than to load one from memory (cheaper as in the number of CPU instructions).
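To make that concrete, here is a minimal sketch in C, assuming an x86 target with SSE and a 3-component vector stored in a 4-float block. The helper name `scale_vec3_padded` is made up for illustration. The zero comes from `_mm_setzero_ps()`, which compilers typically turn into a register-only `xorps` rather than a load, and two shuffles then move it into the padding lane. A scalar fallback is included for targets without SSE.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics */
#endif

/* Hypothetical helper: scale a 3-component vector held in a 4-float block.
   Lane 3 is padding and is forced to 0.0f after the multiply. */
static void scale_vec3_padded(const float in[4], float s, float out[4])
{
#if defined(__SSE__)
    /* The multiply touches all four lanes, including the padding lane. */
    __m128 r = _mm_mul_ps(_mm_loadu_ps(in), _mm_set1_ps(s));

    /* Zero generated in a register (typically xorps), not loaded from memory. */
    __m128 zero = _mm_setzero_ps();

    /* t = (r2, r3, 0, 0) */
    __m128 t = _mm_shuffle_ps(r, zero, _MM_SHUFFLE(0, 0, 3, 2));

    /* result = (r0, r1, r2, 0) */
    _mm_storeu_ps(out, _mm_shuffle_ps(r, t, _MM_SHUFFLE(2, 0, 1, 0)));
#else
    /* Portable fallback that produces the same result. */
    for (size_t i = 0; i < 3; ++i) {
        out[i] = in[i] * s;
    }
    out[3] = 0.0f;
#endif
}
```

Calling it with `{1, 2, 3, 99}` and a scale of `2` leaves `{2, 4, 6, 0}` in `out`: whatever was in the padding lane is discarded. Masking with a constant would also work, but the mask usually has to be loaded from memory, which is the cost being avoided here.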