https://web.archive.org/web/20200310174634/https://people.xi...
The trick is that our hearing systems are logarithmic (we can't hear a quiet sound right next to a loud one; that's what lossy compression relies on), so they map better to floating-point numbers (i.e. 16-bit floating point is way more than enough).
24 bits is effectively for recording engineers, so they have lots of headroom and don't have to worry about clipping basically at all (~6 dB per bit means the 8 extra bits buy about 48 dB of extra headroom, which is a LOT).
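If you want to sanity-check the arithmetic, here's a quick back-of-the-envelope sketch in Python (my own illustration, not from the thread):

```python
import math

# Each bit of integer PCM buys ~6.02 dB of dynamic range (20*log10(2)).
db_per_bit = 20 * math.log10(2)

for bits in (16, 24):
    print(f"{bits}-bit: ~{bits * db_per_bit:.0f} dB")  # ~96 dB and ~144 dB

# Going from 16 to 24 bits adds 8 bits, i.e. ~48 dB of extra headroom:
print(f"extra headroom: ~{(24 - 16) * db_per_bit:.0f} dB")
```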
However, when you calculate non-linear audio effects, you want extra bit depth (generally floating point) because cancellation and multiplication in your intermediate results can really move your noise floor up into bits that humans can actually hear.
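Here's a minimal NumPy sketch of that effect (my own toy example; the -60 dB gain stage and the 440 Hz tone are arbitrary choices): quantizing to 16 bits between a big gain cut and the matching boost drags the noise floor up by roughly the amount of the cut, while a float intermediate doesn't:

```python
import numpy as np

fs = 48_000
t = np.arange(fs) / fs
signal = 0.5 * np.sin(2 * np.pi * 440 * t)  # a -6 dBFS test tone

def as_int16(x):
    """Round to 16-bit PCM steps and back to float."""
    return np.round(x * 32768).clip(-32768, 32767) / 32768

gain = 10 ** (-60 / 20)  # a -60 dB gain stage, undone afterwards

# 16-bit intermediate: quantize after the gain cut, then boost back up.
low_res = as_int16(as_int16(signal * gain) / gain)
# Float intermediate: keep full precision until the final 16-bit output.
full_res = as_int16(signal * gain / gain)

def snr_db(clean, processed):
    noise = processed - clean
    return 10 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

print(f"16-bit intermediates: ~{snr_db(signal, low_res):.0f} dB SNR")   # ~32 dB
print(f"float intermediates:  ~{snr_db(signal, full_res):.0f} dB SNR")  # ~92 dB
```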
The effect of bit depth has little to do with how you perceive the sound; what adding more bits does is allow for more dynamic range, i.e. more difference between the loudest possible and the quietest possible sound. More bits bring down the noise floor. This means that, for example, the final part of a fade-out retains more detail at 24 bits than at 16, but this difference is not something you would be able to observe in normal listening conditions.
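For the fade-out point, a small NumPy demo (again my own illustration, not from the comment): the quantization noise floor stays fixed while the signal fades, so the quiet tail keeps roughly 48 dB more signal-to-noise at 24 bits than at 16:

```python
import numpy as np

fs = 48_000
t = np.arange(fs) / fs
fade = 10 ** (-60 * t / 20)          # fade from 0 dBFS down to -60 dBFS
x = fade * np.sin(2 * np.pi * 440 * t)

def quantize(x, bits):
    scale = 2 ** (bits - 1)
    return np.round(x * scale) / scale

tail = slice(-fs // 10, None)        # last 100 ms, around -54..-60 dBFS
for bits in (16, 24):
    err = quantize(x, bits)[tail] - x[tail]
    snr = 10 * np.log10(np.mean(x[tail] ** 2) / np.mean(err ** 2))
    print(f"{bits}-bit, fade tail: ~{snr:.0f} dB SNR")  # ~48 dB apart
```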
If you'd like to learn more about the effects of bit depth, I would recommend “Digital Show & Tell” by Xiph's Monty at https://www.xiph.org/video/.
That really doesn't make any sense. The bit depth provides the dynamic range, meaning the difference between the loudest and quietest sounds which can be encoded. 16 bits is enough to go from "mosquito in the room" to "jackhammer right in your ear". Congratulations, 24 bits lets you go up to "head in the output nozzle of a rocket taking off" with room to spare; that's… not very useful?
Now what might make sense — aside from plain placebo — is a difference in mastering. For instance lots of SACD comparisons at the time were really comparing differences in mastering, with the SACD converted to regular CDDA turning out way superior to the CD version because the mastering of the CD was so much worse.
The "Loudness Wars" is an especially bad period of horrible mastering, and it went from the mid 90s to the early-mid 2010s (which doesn't mean that regular-CD has gone back to "super awesome", just that you're unlikely to have clipping throughout a piece these days).
Dynamic range is not the loudest sound / quietest sound ratio (as one would expect), but the loudest sound / noise level ratio. Otherwise you would need to count additional bits to encode the quietest sound with low enough quantization noise.
The threshold of hearing can be as low as -9 dB SPL, so one would want the noise level below that. Therefore, with the 96 dB dynamic range from 16 bits, the loudest representable sound would be around 87 dB SPL. But symphonic orchestra music may have peaks well above 100 dB.
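In code, that arithmetic looks like this (the -9 dB SPL threshold and ~6.02 dB per bit are the assumptions stated above):

```python
# Back-of-envelope version of the comment's numbers.
hearing_threshold = -9       # dB SPL, best-case threshold of hearing
dynamic_range = 16 * 6.02    # ~96 dB for 16-bit PCM

# Put the noise floor at the hearing threshold; the loudest
# representable sound then lands at:
peak = hearing_threshold + dynamic_range
print(f"peak: ~{peak:.0f} dB SPL")  # ~87 dB SPL, short of >100 dB orchestral peaks
```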