Tangential:
> Is a given audio file a sample of a kick drum, snare drum, hi-hat, other percussion, or something else? (...) Humans have no trouble classifying these two sounds, as we’ve likely heard them tens of thousands of times before.
Are people taught that in schools or something? Because I personally can't classify those sounds, don't know these names, and I'm not sure how I was supposed to learn them, other than by playing in a band.
This is something that is taught at schools with a music program (although not necessarily discretely).
If you are someone who has played music before, it is easy to forget what music sounded like before your ear was trained (i.e. certain instruments and harmonies can be indistinguishable without training).
Is it common to have never played on a drum kit in your entire life?
> Is it common to have never played on a drum kit in your entire life?
I didn't, not on a real one at least. I know the sounds though; I spent an ungodly amount of time playing on an electronic keyboard as a kid, where I could (and often would) change the sounds under the keys to drums. However, nowhere (AFAIR) were the names of those sounds mentioned, and I'm not sure where I could have encountered them.
A kick drum is the big foot-operated one on the floor. Congrats, you now know the name for the label, and you can most definitely pick a kick drum sample out from other samples.
An example of one open source speaker recognition project is [bob.bio.spear](https://pypi.org/project/bob.bio.spear/), which I haven't tried, but looks promising.
- I did consider using image processing techniques as opposed to decision trees, but the point here was not to come up with the most advanced and accurate classifier possible, but rather to build something simple and explainable to folks without an ML (or even a CA) background.
- I haven't tried this extensively on non-drum-like percussion, but that'd be a great follow-up post.