Many lossy codecs exploit the fact that the signals of interest (audio, video, images, whatever) are sparse when viewed in some transform domain, typically Fourier or wavelet. We apply the transform and retain only the largest coefficients; these get quantized and transmitted, and then we use the inverse transform to reconstruct the signal. The 'loss' comes from the thresholding/quantization step. It's certainly sparsity-driven, but I wouldn't call it "compressed sensing".
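To make that pipeline concrete, here's a minimal sketch of the transform-keep-quantize-invert loop in Python. The function name, the test signal, and the parameters are all mine, chosen for illustration; a real codec does far more (blocking, entropy coding, perceptual weighting):

```python
import numpy as np
from scipy.fft import dct, idct

def toy_transform_code(x, keep_frac=0.1, step=0.05):
    """Toy transform codec: DCT -> keep largest coefficients -> quantize -> inverse DCT."""
    c = dct(x, norm="ortho")                      # move to a domain where x is (hopefully) sparse
    k = max(1, int(keep_frac * len(c)))
    idx = np.argsort(np.abs(c))[-k:]              # indices of the k largest-magnitude coefficients
    c_kept = np.zeros_like(c)
    c_kept[idx] = np.round(c[idx] / step) * step  # uniform quantization; this is where the 'loss' happens
    return idct(c_kept, norm="ortho")

# A smooth signal is sparse in the DCT domain, so a few coefficients suffice.
t = np.linspace(0, 1, 256)
x = np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 7 * t)
x_hat = toy_transform_code(x)
print("relative error:", np.linalg.norm(x - x_hat) / np.linalg.norm(x))
```

Note that we measure the *entire* signal first and only discard information afterwards; compressed sensing is about never acquiring the redundant measurements in the first place.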
"Compressed sensing" should mean "stable reconstruction of my signal using data that is acquired at an optimal rate". The "optimal rate" is roughly proportional to the sparsity of the signal.
Check out the first section of [1].
Do they make use of the fact that smooth signals are sparse in the DCT domain? Sure. But that was well known long before compressed sensing was a thing.
AFAIK the specific techniques of compressed sensing have not made their way into industry at all.
Not that they couldn't be applied; I'm fond of "Spatial Sparsity-Induced Prediction for Images and Video: A Simple Way to Reject Structured Interference".
(And, in my view, compressed sensing almost completely diverted academic attention away from techniques that would be useful for signal compression in industry; maybe with a couple more orders of magnitude of improvement in computing power, the common techniques in that space will become more useful for compression.)