A tip for the author: if you notice a number of your peers doing something, perhaps you should give them the benefit of the doubt and inquire further.
Edit: Even more to the point is that being able to do scalable transformations like computing top N CTR on a lot of data with little regard to available computing/network/disk resources is the reason why you would copy your input data to EC2 for processing. If the author has a point to make he failed to do so beyond making himself look like someone who enjoys labeling things he doesn't understand as a "cargo cult."