We're toying with the idea of implementing some sort of wildcard so that we can present the URLs in natural order, something like *.google.com to retrieve all URLs under google. But we wanted to judge the level of interest first. After all, "done" is better than "perfect".
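For anyone curious how such a wildcard might map onto the index, here is a rough sketch. It assumes, judging only from queries like com.pbm.www/~lindahl, that keys store the host with its labels reversed, followed by the path. The function names (host_to_key, wildcard_to_prefix, match_wildcard) are made up for illustration and are not part of the URL Search service.

```python
def host_to_key(host: str) -> str:
    """Reverse the dot-separated labels: 'www.pbm.com' -> 'com.pbm.www'."""
    return ".".join(reversed(host.lower().split(".")))


def key_to_host(key: str) -> str:
    """Reversing the labels again restores natural order."""
    return ".".join(reversed(key.split(".")))


def wildcard_to_prefix(pattern: str) -> str:
    """Turn '*.google.com' into the reversed-key prefix 'com.google.'."""
    if not pattern.startswith("*."):
        raise ValueError("only leading '*.' wildcards are handled in this sketch")
    return host_to_key(pattern[2:]) + "."


def match_wildcard(pattern: str, keys):
    """Yield matching entries in natural order (host first, then path).

    Assumption: each key looks like 'com.google.mail/inbox'. The bare
    domain ('com.google/...') is treated as a match as well.
    """
    prefix = wildcard_to_prefix(pattern)
    for key in sorted(keys):
        host_key, sep, path = key.partition("/")
        # Appending '.' keeps 'com.googleplex' from matching 'com.google.'
        if (host_key + ".").startswith(prefix):
            yield key_to_host(host_key) + sep + path


if __name__ == "__main__":
    sample = [
        "com.google.mail/inbox",
        "com.google.www/",
        "com.googleplex.www/",
        "com.pbm.www/~lindahl",
    ]
    for url in match_wildcard("*.google.com", sample):
        print(url)
```

The point of the reversed form is that everything under one domain shares a common prefix, so a wildcard becomes a simple prefix scan over sorted keys and the results only need flipping back for display.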
http://urlsearch.commoncrawl.org/?q=com.pbm.www%2F~lindahl
So there's a bug there, but it doesn't show up every time for ~.
If you didn't see the details in the blog post, Common Crawl is giving out $100 in AWS credit to the first five people who share code that incorporates a JSON file from the URL Search.
http://%2E%2E%2E@harunyahya.com/Ajax/get.comments/oid/4612 176.34.181.212 20120516214328 text/html 912