When treated like a puzzle it can be really interesting. So I thought I'd share a few tidbits.
1) We did a simple 'speed' test: count how many queries per second were coming from an IP and auto-ban when the limit was exceeded. We started at 100 qps and watched the traffic move down to 99.5 qps, pushed it to 10 qps and watched the traffic follow it down. Even at 3 qps you would get traffic at a bit over 2 qps trying to limbo in under the limit.
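The speed test can be sketched as a per-IP sliding window with an auto-ban. This is just an illustration of the idea, not our actual code; the names (`RateLimiter`, `limit_qps`) are made up.

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Per-IP one-second sliding window; auto-ban on exceeding the limit."""

    def __init__(self, limit_qps):
        self.limit_qps = limit_qps
        self.windows = defaultdict(deque)  # ip -> timestamps of recent queries
        self.banned = set()

    def allow(self, ip, now=None):
        if ip in self.banned:
            return False
        now = time.monotonic() if now is None else now
        win = self.windows[ip]
        win.append(now)
        # Drop timestamps older than one second.
        while win and now - win[0] > 1.0:
            win.popleft()
        if len(win) > self.limit_qps:
            self.banned.add(ip)  # auto-ban: over the limit within one second
            return False
        return True
```

Note that a scraper pacing itself at just under `limit_qps` never trips this, which is exactly the limbo behavior described.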
2) At that time, lots of people who hijacked browsers with toolbars sold scraping as a service to third parties. The toolbar would check in to see if it should run a query, launch it, and return the results without the user even knowing. One company, 80legs, was pretty up front about their "service"; SEO types would use it to scrape Google results to see how their SEO campaigns were doing.
3) The majority of the traffic had criminal intent: looking for metadata on web pages indicating that a site was running an unpatched version of some store software or had SQL injection bugs. These queries would often come from PCs that had been compromised for other purposes, "zombie" PCs. We could rapidly map out these networks when we got 100 queries from 100 different IPs looking for "joomla version x.y",p=1 through "joomla version x.y",p=100. We briefly played around with sending them official-looking SERPs where all the links went to fbi.gov through an obfuscator.
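Spotting that zombie pattern amounts to collapsing the page parameter out of each query and counting how many distinct IPs send the same template. A hypothetical sketch (the function name, the `p=N` normalization, and the threshold are all assumptions, not the real pipeline):

```python
import re
from collections import defaultdict

def find_coordinated(queries, min_ips=50):
    """queries: iterable of (ip, query_string) pairs.

    Returns query templates that arrived from at least min_ips
    distinct IPs once page numbers are normalized away.
    """
    by_template = defaultdict(set)  # template -> set of source IPs
    for ip, q in queries:
        # Collapse paged variants, e.g. '...,p=37' -> '...,p=N',
        # so the 100-IP / 100-page sweep shows up as one template.
        template = re.sub(r"p=\d+", "p=N", q)
        by_template[template].add(ip)
    return [t for t, ips in by_template.items() if len(ips) >= min_ips]
```

One template hit from a hundred IPs in a short window gives you the membership of that botnet essentially for free.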
One of the more effective strategies was to field a "black hole" server: basically an HTTP server that answered as if you had gotten hold of it but then never sent any data. With some simple kernel mods these TCP connections were silently removed on our end, so they took no resources while the client waited basically forever. We ack'd all keep-alive packets with "Yup, we're here." so they just kept waiting and waiting.
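A userspace approximation of the black hole is easy to sketch: accept the TCP connection, then never send a byte, so a well-behaved client blocks waiting for an HTTP response that never comes. The real version needed kernel mods to shed the connection state; this toy one (names are illustrative) still holds the sockets open:

```python
import socket
import threading

def black_hole(host="127.0.0.1", port=0):
    """Listen and accept connections, but never reply. Returns the bound port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(64)

    def serve():
        conns = []  # keep references so the sockets stay open
        while True:
            c, _ = srv.accept()
            conns.append(c)  # accept, then simply never respond

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()[1]
```

The difference from the production setup is that here each open socket still costs a file descriptor and some kernel state; the kernel mods were what made it free on our end.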
It really was a never-ending game. We mass-banned an entire Ukrainian ISP because out of billions of queries not a single one was legitimate.