Here's how I took it: if you're just losing sensitivity at a particular frequency, then you may only hear sounds in the 40-100 dB range; anything below that is too quiet to register and anything above it is painful. That's a lot of information to lose, but you can smash the 1-100 dB range into the 40-100 dB range. If you choose to, you could even smash the 1-60 dB range into the 40-60 dB range (or pick whatever numbers) and leave everything above that relatively untouched. This is a fairly common sound engineering technique to fill out a sound without destroying its dynamics.
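For what it's worth, here's a rough Python sketch of that "smashing", working purely on level numbers in dB. The function names and the straight-line mapping are my own assumptions for illustration; a real compressor works on the actual signal with ratios, attack/release times and so on.

```python
def squash_db(level_db, in_lo=1.0, in_hi=100.0, out_lo=40.0, out_hi=100.0):
    """Linearly map a level in [in_lo, in_hi] dB onto [out_lo, out_hi] dB."""
    # Clamp to the input range so out-of-range levels land on the edges.
    level_db = min(max(level_db, in_lo), in_hi)
    # Position of the level inside the input range, from 0.0 to 1.0.
    frac = (level_db - in_lo) / (in_hi - in_lo)
    return out_lo + frac * (out_hi - out_lo)


def squash_below_knee(level_db, knee_db=60.0, in_lo=1.0, out_lo=40.0):
    """Squash only [in_lo, knee_db] into [out_lo, knee_db]; leave louder levels alone."""
    if level_db >= knee_db:
        return level_db  # everything above the knee is untouched
    return squash_db(level_db, in_lo, knee_db, out_lo, knee_db)


if __name__ == "__main__":
    for level in (1.0, 30.0, 60.0, 80.0, 100.0):
        print(level, squash_db(level), squash_below_knee(level))
```

The first function is the "smash everything" version; the second is the "only lift the quiet stuff" version, where the two pieces meet at 60 dB so there's no jump at the knee.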
So if you picture a scale from 1-100 next to the bear picture, then the bottom part of the bear is what's beneath the (effective) noise floor for that frequency. To extend the analogy to multiband compression, you'd have maybe 10 bears next to each other, each missing a different amount and each needing a slightly different smashing to lift the bottom of the picture into the visible range.
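The multiband version of the idea could be sketched like this in Python: each band gets its own floor, and levels in that band are lifted by a different amount. The band frequencies and floor values below are made up for illustration, and this only remaps level numbers; it doesn't split real audio into bands.

```python
def lift_band(level_db, floor_db, in_lo=1.0, ceiling_db=100.0):
    """Linearly map [in_lo, ceiling_db] dB onto [floor_db, ceiling_db] dB for one band."""
    level_db = min(max(level_db, in_lo), ceiling_db)
    frac = (level_db - in_lo) / (ceiling_db - in_lo)
    return floor_db + frac * (ceiling_db - floor_db)


# Hypothetical per-band floors (band center in Hz -> quietest audible level in dB).
band_floors_db = {250: 20.0, 1000: 30.0, 4000: 55.0}


def lift_all_bands(levels_db, floors_db):
    """Apply each band's own mapping; both dicts are keyed by band center in Hz."""
    return {hz: lift_band(levels_db[hz], floors_db[hz]) for hz in levels_db}


if __name__ == "__main__":
    quiet_sound = {250: 10.0, 1000: 10.0, 4000: 10.0}
    print(lift_all_bands(quiet_sound, band_floors_db))
```

The same quiet input comes out at a different level in each band, which is the "each bear needs a slightly different smashing" part.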
edit: I think people are assuming that the frequency content of the bear picture corresponds to the frequency content of sound (they're all signals, right?), but to me it's a much more basic analogy. To do it that way you'd have to be turning up the soft reds or something to that effect, but rods and cones being what they are, we don't lose vision in a way comparable to how we lose hearing, so I don't think there's a good, intuitive visual analog in that sense.