I could be wrong, but I believe the default there is that spiders are blocked and only the listed "User-Agents" are allowed to scrape (though not the disallowed pages).
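For what it's worth, here is a rough sketch of how a file laid out that way gets evaluated, using Python's standard `urllib.robotparser`. The robots.txt content, the `allowed` helper, and the example.com URLs are all made up for illustration and are not copied from any real site. Note that the "blocked by default" behaviour comes from the catch-all `User-agent: *` / `Disallow: /` group; with no matching rules at all, the parser treats everything as allowed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler gets access (minus one path),
# and a catch-all group blocks every other user agent.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    checks = [
        ("Googlebot", "https://example.com/public/page"),
        ("Googlebot", "https://example.com/private/x"),
        ("SomeOtherBot", "https://example.com/public/page"),
    ]
    for agent, url in checks:
        print(agent, url, allowed(ROBOTS_TXT, agent, url))
```

This only tells a well-behaved client what the site is asking for; nothing in it is enforced on the wire.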
Even Facebook's robots.txt has a hatred for my pseudo-anonymous browser settings. Facebook gives me this (for any page): "Sorry, something went wrong. We're working on getting this fixed as soon as we can."
You just ignore the robots.txt file and crawl slowly from distributed virtual machines.
Not that you should do that. Robots.txt is only a nicety, though: the client doesn't have to respect it, and the server doesn't have to allow your HTTP requests.