This looks like a fun project, but I can't help but feel like Hadoop experience reports are a little late to the party at this point. Is there anyone out there who doesn't immediately think MapReduce when they see numbers at scale like this? If anything, the tool is overused, not neglected.
Maybe the headline is a bit misleading, but I find the post interesting and valuable.
The cost, though... that was shockingly low! It comes out to about $42/month to host the data and $2 per analysis run. That's very affordable if you're producing results someone might want to pay for.
Do you get back info such as how long each instance took to do its job?
However, in Paul's case here, he's really just using MR as a quick way to run a parallel computation on many machines. There's no reduce step; it just takes a single average from each individual song, without correlating anything or using any inter-song statistics.
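To make the "map-only" point concrete, here is a rough sketch of what such a job boils down to. It is not Paul's actual code: the `song_average` and `run_map_only` names and the `(song_id, values)` input shape are made up for illustration.

```python
from statistics import mean

def song_average(song):
    """Map step: collapse one song's values into a single average."""
    song_id, values = song  # hypothetical (id, list of numbers) record
    return song_id, mean(values)

def run_map_only(songs):
    # Every song is handled independently, so this map could be handed to
    # multiprocessing.Pool.map or spread across machines. Nothing is
    # combined across songs afterwards, i.e. there is no reduce step.
    return dict(map(song_average, songs))

songs = [("song_a", [1.0, 2.0, 3.0]), ("song_b", [10.0, 20.0])]
averages = run_map_only(songs)
print(averages)
```

Because no song's result depends on any other song's, the work is embarrassingly parallel, and the framework is really only providing scheduling and data distribution.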