Also, peak detection can be improved a lot by searching for local maxima of the level instead of taking a threshold of the amplitude.
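To make that concrete, here is a rough sketch of the difference, not the write-up's actual code. It assumes mono samples in a plain list; the frame length, the 6 dB minimum rise, and the function names `level_db`, `local_maxima`, and `threshold_frames` are all made up for illustration.

```python
import math

def level_db(samples, frame_len):
    """Short-term RMS level in dB for consecutive non-overlapping frames."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        levels.append(20.0 * math.log10(max(rms, 1e-12)))  # floor avoids log(0)
    return levels

def local_maxima(levels, min_rise_db=6.0, window=2):
    """Frames louder than both neighbours and at least min_rise_db above
    the quietest frame within `window` frames on either side."""
    peaks = []
    for i in range(1, len(levels) - 1):
        if levels[i] > levels[i - 1] and levels[i] >= levels[i + 1]:
            nearby = levels[max(0, i - window):i + window + 1]
            if levels[i] - min(nearby) >= min_rise_db:
                peaks.append(i)
    return peaks

def threshold_frames(samples, frame_len, threshold):
    """Naive alternative: frames whose peak absolute amplitude exceeds a fixed threshold."""
    hits = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        if max(abs(x) for x in samples[start:start + frame_len]) > threshold:
            hits.append(start // frame_len)
    return hits

# Synthetic test signal (an assumption for the demo): quiet background,
# one loud hit in frame 5 and one much quieter hit in frame 14.
FRAME = 100
amplitudes = [0.001] * 20
amplitudes[5] = 1.0
amplitudes[14] = 0.05
signal = []
for a in amplitudes:
    signal.extend([a, -a] * (FRAME // 2))

print("threshold:", threshold_frames(signal, FRAME, 0.5))
print("local maxima:", local_maxima(level_db(signal, FRAME)))
```

The fixed threshold only catches the loud hit, while the local-maximum search on the dB level also finds the quiet one, because it compares each frame to its surroundings rather than to an absolute value.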
As an acoustics scientist, I find this approach frustratingly naive. Still, nice write-up!
I think we'll circle around and try again in a less naive way :)
I wonder if the algorithm can adapt to recordings where the tempo isn't quite constant.
Amusingly, I first read "beat" as a verb in your title, and was expecting something cloak-and-dagger.
In the testing section, the songs we fail to identify correctly are usually the ones where we couldn't get a "handle" by identifying common intervals. A varying tempo would exacerbate that problem.
Additionally, the question becomes: what is the tempo of a song that has an inconsistent tempo?
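One possible answer is to report a tempo curve rather than a single number. The sketch below is only an illustration under assumptions, not what the write-up does: it takes beat times in seconds that have already been detected, and the function name `local_tempo_bpm` and the window of 4 intervals are hypothetical.

```python
from statistics import median

def local_tempo_bpm(beat_times, window=4):
    """BPM estimates from the median inter-beat interval in a sliding window."""
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    tempos = []
    for i in range(len(intervals) - window + 1):
        tempos.append(60.0 / median(intervals[i:i + window]))
    return tempos

# Made-up beat times: 8 beats spaced 0.5 s apart, then 8 spaced 0.4 s apart.
beat_times = [0.0]
for interval in [0.5] * 8 + [0.4] * 8:
    beat_times.append(beat_times[-1] + interval)

tempos = local_tempo_bpm(beat_times)
print("start: %.1f BPM, end: %.1f BPM" % (tempos[0], tempos[-1]))
print("single-number summary (median): %.1f BPM" % median(tempos))
```

For a song that speeds up, the median of the curve gives one defensible "overall" tempo, but the spread between the first and last values is arguably the more honest thing to report.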
j/k, very cool :)