I wrote something that hashed audiobook files, and it was taking forever, so I tried hashing only the first N bytes (likely much more than 10kB). I soon found that, for any given audiobook, each chapter's MP3 had a large identical header at the front. I imagine it was a cover image embedded in the metadata.
I think in the end I just started taking the data from the end of the file instead. But if you're going with subsets, it's probably better to use a pseudo-randomly selected subset rather than one contiguous block. It doesn't have to be a different pseudo-random subset for each file, but I imagine there's an ideal noise profile in the sampling (maybe white noise is best).
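
For what it's worth, here's a rough Python sketch of the pseudo-random subset idea. It isn't what I actually ran back then. The function name `sampled_hash`, the chunk count, the chunk size, and the fixed seed are all made up for illustration, and I just used plain uniform sampling since I don't know what the ideal profile would be. With a fixed seed, files of the same size get sampled at the same offsets.

```python
import hashlib
import os
import random


def sampled_hash(path, n_chunks=64, chunk_size=4096, seed=0):
    """Hash the file size plus n_chunks pseudo-randomly placed chunks.

    All parameters are illustrative assumptions, not tuned values.
    """
    size = os.path.getsize(path)
    h = hashlib.sha256()
    # Mix in the file size so that length differences affect the digest.
    h.update(size.to_bytes(8, "big"))

    if size <= n_chunks * chunk_size:
        # Small file: sampling saves nothing, so hash the whole thing.
        with open(path, "rb") as f:
            h.update(f.read())
        return h.hexdigest()

    # Fixed seed: the offsets depend only on the seed and the file size,
    # so repeated runs (and same-sized files) use the same sample positions.
    rng = random.Random(seed)
    offsets = sorted(
        rng.randrange(size - chunk_size + 1) for _ in range(n_chunks)
    )

    with open(path, "rb") as f:
        for offset in offsets:
            f.seek(offset)
            h.update(f.read(chunk_size))
    return h.hexdigest()
```

Since this only looks at part of each file, two files that differ only in unsampled regions will get the same digest, so I'd treat a match as "probably the same" and confirm with a full hash if it matters.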