(waits for somebody to claim that each request came from a different proxy)
The complaint alleges that the defendant disguised its
web crawler to mask its source IP address and thus
prevented QVC technicians from identifying the source of
the requests and quickly repairing the problem.
Seems they were using proxies, but you don't want to go blocking IPs just for making a number of requests, right? They could be legitimate shoppers opening lots of tabs, refreshing, hitting image/resource-heavy pages, analytics, etc. I'm not a network or server admin, though, so maybe there are some tools or methods that help identify the bad traffic.
Multiple tabs do not translate to concurrent requests. A shop like QVC should have a high limit and frankly an automatic system that flags stuff like this and notifies proper parties.
At this level you should know this will happen and have placeholders in place.
If every single IP is unique, an additional layer would be pattern-matching traffic by network class and actively flagging it.
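A rough sketch of that idea in Python, assuming IPv4 and assuming that "network class" can be approximated by a fixed prefix length. The function name `flag_busy_networks`, the /24 default, and the threshold are all made up for illustration, not anything QVC is known to run:

```python
import ipaddress
from collections import Counter

def flag_busy_networks(ips, prefix_len=24, threshold=100):
    """Count requests per network and return networks at or above threshold.

    prefix_len and threshold are illustrative defaults, not recommendations.
    """
    counts = Counter(
        # strict=False lets us pass a host address and get its enclosing network
        ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        for ip in ips
    )
    return {str(net): n for net, n in counts.items() if n >= threshold}
```

Feed it the client IPs from a recent slice of the access log. Proxies rented from the same provider often cluster in a few blocks, so the per-network count can stand out even when every individual IP looks quiet.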
Another way to rate limit is by number of pages per session / time.
Then you can have another fail-safe method where, if X exceeds Y, you start throttling the top X (dynamically scaling up until X returns to normal) while notifying DevOps of the issue.
Etc.
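For the pages-per-session/time limit, a minimal sliding-window sketch in Python. The class name `SlidingWindowLimiter`, the in-memory storage, and the idea of keying on a session ID are assumptions for illustration; a real shop would likely keep this state in a shared store and alert someone when the limit trips:

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most max_requests per session within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # session_id -> timestamps of allowed hits

    def allow(self, session_id, now=None):
        # now can be injected for testing; defaults to a monotonic clock
        now = time.monotonic() if now is None else now
        hits = self._hits[session_id]
        # Drop timestamps that have fallen out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: throttle or reject this request
        hits.append(now)
        return True
```

When `allow` returns False, the caller decides what throttling means (delay, 429, captcha). The same structure works keyed on IP or network instead of session.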
By FAR the most annoying of any of these is when Google, Bing and/or Yahoo decide to wake up and crawl your infrastructure with little regard to your robots.txt or webmaster settings, if available. I think they have got better in recent years, but they used to be the absolute worst. It came down to: let us DoS you, or have your ranking suffer. Suing Google, Bing, or Yahoo isn't exactly an option.
Some context: I was the lead architect/engineer combo for a CMS that hosted ~500k domains for a fairly large international company. Some days I could log in and see them crawling every domain from A-Z. Some days I would get caught by Google and Bing at the same time. They were the largest consumers of data on this system.
In fact, for a while we would get Bing (MSN bot back then) crawling us every day at the same time, almost on the dot.
Let me plug Project Honey Pot (which I am in no way affiliated with). This is a truly awesome, surprisingly accurate, free service that does an amazing job of collecting heuristics on suspicious IP activity and exposing them in an easy-to-interpret way.
To your point, if you are a large provider, especially one that passively and actively sends a lot of money towards the search engines, there are some additional options at your disposal. We (the business units) had contacts with AdSense etc., which would come in handy.
There is one form of internet justice: QVC should file abuse complaints with the ISPs that host those IPs. I've found abuse complaints are the best way to stop people from using IPs for bad activities (excessive scraping, spamming, etc.).
The complaint alleges that the defendant disguised its
web crawler to mask its source IP address and thus
prevented QVC technicians from identifying the source of
the requests and quickly repairing the problem.
Your comment about crawler politeness is spot-on. At the same time, the onus shouldn't be on QVC to have to block this; result.ly should either be a good citizen or have to face a lawsuit. Granted, QVC's tech team should be able to deal with this, because next time the person who is DoSing them might not be a US entity which can be sued, but that isn't entirely relevant in this situation.
That's a strange claim given that we're talking about a "contract" which QVC has no proof that the other party read or agreed to, and for which there has been no explicit exchange ("offer" and "acceptance").
Are website contracts/terms even enforceable at all? According to this article [0] and case law, likely not. Strange thing for a lawyer to say, but this article makes a lot of strange claims that seem inconsistent with US case law.
[0] http://www.forbes.com/sites/oliverherzfeld/2013/01/22/are-we...
Trying to enforce a contract on a crawler that's just fetching pages without ever checking a box is much more difficult... many failures in the past.
Also, "clearest source of a remedy" refers to ability to actually get compensation, which may be limited or difficult to argue for under DMCA/CFAA.
Still yeah, that is too much.