I believe that's the full dynamic range that human hearing can possibly process: from a really tiny signal that a human can just barely hear, with noise underneath it, up to a really loud signal that is basically pain. Most humans don't have that range. Note that the issue is that the quiet signal needs to be above the noise, so whatever your signal is, the noise floor needs to be below the threshold of hearing given that signal. (I believe that for "normal" signals the noise floor needs to be more than 50 to 60 dB down, while for very quiet signals the threshold of detection is only about 20 dB further down.)
The trick is that our hearing is logarithmic (we can't hear a quiet sound next to a loud sound, which is what lossy compression relies on), so it maps to floating point numbers better (i.e., 16-bit floating point is way more than enough).
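To make the "floating point tracks a logarithmic scale" point concrete, here is a rough sketch (mine, not a measurement of anything) that round-trips a few levels through IEEE 754 half precision using Python's struct format 'e' and compares against 16-bit fixed point. The test levels and the helper names are just assumptions for illustration.

```python
import struct

def to_half(x):
    """Round-trip x through IEEE 754 half precision (struct format 'e')."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

def to_fixed16(x):
    """Round x in [-1, 1) to the nearest 16-bit signed fixed-point step."""
    scale = 2 ** 15
    return round(x * scale) / scale

def relative_error(x, quantizer):
    return abs(quantizer(x) - x) / abs(x)

# Illustrative levels from 0.7 down to 0.00007, roughly an 80 dB span.
levels = [0.7 * 10 ** (-k) for k in range(5)]
half_errors = [relative_error(v, to_half) for v in levels]
fixed_errors = [relative_error(v, to_fixed16) for v in levels]

for v, h, f in zip(levels, half_errors, fixed_errors):
    print(f"level {v:.5f}  half rel. error {h:.2e}  fixed rel. error {f:.2e}")
```

The half-precision relative error stays bounded at every level, while the fixed-point relative error grows as the signal gets quieter, which is the sense in which floating point fits a logarithmic ear better.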
24 bits is effectively for recording engineers, so they have lots of headroom and basically don't have to worry about clipping at all (at about 6 dB per bit, the 8 extra bits over 16-bit give about 48 dB of extra headroom, which is a LOT).
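As a quick check on the 6 dB per bit rule of thumb, here is a small sketch that computes the ratio between full scale and one quantization step for a given bit depth. Using 16-bit as the baseline to compare 24-bit against is my assumption.

```python
import math

def dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

range_16 = dynamic_range_db(16)
range_24 = dynamic_range_db(24)
extra = range_24 - range_16

print(f"16-bit: {range_16:.1f} dB")
print(f"24-bit: {range_24:.1f} dB")
print(f"extra from the 8 additional bits: {extra:.1f} dB")
```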
However, when you calculate non-linear audio effects, you want extra bit depth (generally floating point) because cancellation and multiplication in your intermediate results can really move your noise floor up into bits that humans can actually hear.
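A toy sketch of the cancellation problem, under made-up assumptions: two nearly identical signals (the 0.999 gain and the test tone are arbitrary) are each stored at a given bit depth and then subtracted. The quantization noise stays the same size while the wanted signal shrinks, so the signal-to-noise ratio of the result collapses at 16 bits and holds up much better at 24.

```python
import math

def quantize(x, bits):
    """Round x in [-1, 1) to a signed fixed-point grid with the given bit depth."""
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale

def snr_db(reference, approx):
    """Signal-to-noise ratio of approx against reference, in dB."""
    signal = sum(r * r for r in reference)
    noise = sum((r - a) ** 2 for r, a in zip(reference, approx))
    return 10 * math.log10(signal / noise)

def cancellation_snr(bits, n=10000):
    """SNR of (a - b) when a and b are quantized before being subtracted."""
    a = [0.9 * math.sin(0.1234 * i) for i in range(n)]  # arbitrary test tone
    b = [0.999 * v for v in a]                          # nearly identical copy
    exact = [x - y for x, y in zip(a, b)]
    stored = [quantize(x, bits) - quantize(y, bits) for x, y in zip(a, b)]
    return snr_db(exact, stored)

snr_16 = cancellation_snr(16)
snr_24 = cancellation_snr(24)
print(f"difference signal SNR with 16-bit intermediates: {snr_16:.1f} dB")
print(f"difference signal SNR with 24-bit intermediates: {snr_24:.1f} dB")
```

Real effect chains do this kind of thing many times over, which is why the intermediate math usually runs at a higher precision than the final output.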