You are right in the abstract, but these are very high-resolution "lossy" signals. If you look at the spectrum of the bird songs, it falls well inside the band where there is no signal loss. Your complaint is like objecting to a face recognition system because it used lossily compressed but very high-resolution images. That criticism would be just as nonsensical.
A more meaningful criticism would target the use of the Wigner transform itself, which seems to produce ringing artifacts in the visualization that are not seen with the more common windowed FFT.
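
One likely source of that ringing is the cross-term interference that any Wigner-Ville distribution shows for multi-component signals: because the transform is quadratic, two components produce an oscillating term midway between them. Here is a minimal sketch of the effect. It is not the code from the work being discussed; the two-tone test signal, the frequencies, and the window length are all my own assumptions, chosen only to make the comparison easy to read.

```python
import numpy as np

# Assumed test signal: two steady complex tones (frequencies in cycles/sample).
f1, f2 = 0.125, 0.25
n = np.arange(256)
x = np.exp(2j * np.pi * f1 * n) + np.exp(2j * np.pi * f2 * n)

n0, half = 128, 64                 # analysis instant and half-width in samples
m = np.arange(-half, half)         # lags -64..63

# Wigner-Ville slice at n0: FFT over lag m of x[n0+m] * conj(x[n0-m]).
r = x[n0 + m] * np.conj(x[n0 - m])
wvd = np.fft.fft(np.fft.ifftshift(r)).real
wvd_freqs = np.arange(2 * half) / (4 * half)   # bin k maps to k / (2 * M) cycles/sample

# Windowed FFT (one spectrogram column) over the same samples.
seg = x[n0 - half:n0 + half] * np.hanning(2 * half)
stft = np.abs(np.fft.fft(seg))
stft_freqs = np.fft.fftfreq(2 * half)

def value_at(values, freqs, f):
    """Return the entry of `values` whose frequency is closest to f."""
    return values[np.argmin(np.abs(freqs - f))]

f_mid = (f1 + f2) / 2              # the signal has no energy at this frequency
wvd_ratio = abs(value_at(wvd, wvd_freqs, f_mid)) / abs(value_at(wvd, wvd_freqs, f1))
stft_ratio = value_at(stft, stft_freqs, f_mid) / value_at(stft, stft_freqs, f1)
print(f"WVD  midpoint/tone ratio: {wvd_ratio:.3f}")
print(f"STFT midpoint/tone ratio: {stft_ratio:.3f}")
```

With these assumed parameters the Wigner slice puts more energy at the midpoint frequency than at the real tone, while the windowed FFT shows almost nothing there. The cross-term also oscillates in time at the difference frequency, which is what would read as ringing in a time-frequency plot.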