The basic problem: the quieter a sound or detail gets, the fewer bits of resolution are used to represent it.
In 16-bit recording, there simply aren't enough bits to represent very low level details without distorting them with a subtle but audible crunchy digital halo of quantisation noise.
In a 24-bit recording, there are.
Talking about dynamic range completely misses the point. It's the not the absolute difference between the loudest and quietest sounds that matters - it's the accuracy with which the quieter sounds are reproduced.
This is because in a studio, 0dB full-scale meter redline is calibrated to a standard voltage reference, and both consumer and professional audio has equivalent standard levels for the loudest level possible.
These levels don't change for different bit depths, and they're used on both analog and digital equipment. (In fact they've been standard for decades now.)
This is why using more bits does not mean you can "reproduce music with a bigger dynamic range" - not without turning the volume up, anyway.
What actually happens is that the maximum possible volume of a playback system stays the same, but quieter sounds are reproduced with more or less accuracy.
In a 16-bit recording quiet sounds below around 50Db have 1-8 bits of effective resolution, which is nowhere near enough for truly accurate reproduction. (Try listening to an 8-bit recording to hear what this means.)
You might think it doesn't matter because they're quiet. Not so. 50dB is a long way from being inaudible, ears can be incredibly good at spectral estimation, and your brain parses spectral content and volume as separate things.
There's a wide range between "loud enough to hear" and "too loud" and 24-bit covers that whole range accurately. 16-bit is fine for louder sounds, but the quieter details just above "loud enough to get hear" get audibly bit-crushed.
The effect isn't glaringly disturbing, and adding dither helps make it even less obvious. But it's still there.
24-bit doesn't need tricks like dither - because it does the job properly in the first place.
Now - whether or not commercial recordings have enough musical detail to take full advantage of 24-bits is a different question. For various reasons - compression, mastering, cheapness - many don't.
But if you have any kind of aural sensitivity, you really should be able to A/B the difference between a 24-bit uncompressed orchestral recording and a 16-bit recording using an otherwise identical studio-grade mixer/mike/recorder/speaker system without too much difficulty.