This most recent YC round, my co-founder and I used SkyDrive to edit our application. SkyDrive integrates pretty nicely with Word, even on a Mac, to allow for collaborative editing. It's like the best parts of SharePoint, minus all the crap, and inside of a modern UI. I'm a diehard Apple user, but I also subscribe to the "right tool for the job" principle ... in this case it worked pretty well.
Anyway, inside the document were links to some private areas of our website that contained demo materials for YC. As requested, they were not password protected, but also not linked from anywhere else. While submitting I ensured that nginx would record visits to these URLs in a separate log, so we'd know when they were being looked at (sidenote, seeing visitors coming from inside justin.tv + the rincon hill towers is kind of exhilarating).
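For anyone who wants to do the same kind of watching without a dedicated nginx log, here is a rough Python sketch that picks the private hits out of an ordinary access log. It assumes the stock "combined" log format, and the `/yc-demo/` prefix is made up for illustration since the real paths aren't given here.

```python
import re

# Hypothetical prefix for the unlinked demo pages; the real paths are not given.
PRIVATE_PREFIX = "/yc-demo/"

# Assumes the "combined" format:
# ip - user [time] "METHOD path proto" status bytes "referer" "agent"
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def private_hits(lines, prefix=PRIVATE_PREFIX):
    """Yield (ip, path, agent) for each request whose path starts with prefix."""
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("path").startswith(prefix):
            yield m.group("ip"), m.group("path"), m.group("agent")
```

Feeding it the lines of the access log gives the client IP and user agent for every request under the prefix, which is enough to tell a human visitor from a crawler.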
What surprised me was that almost immediately after we began working on the document, the Bing bot was going apeshit exploring the domain and the 'private' URLs. I had to quickly add a robots.txt to deny all on the root. I thought it was pretty interesting. At first I felt almost violated. But then, it seems logical that they'd be indexing every URL in every document stored in their datacenter; why not?
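A deny-all robots.txt is only two lines, and it is easy to check that it says what you think it says with Python's standard `urllib.robotparser`. This is just a sketch: the file contents are the usual deny-all form, not necessarily the exact file used above, and the `allowed` helper and example URLs are made up.

```python
from urllib.robotparser import RobotFileParser

# The usual "deny everything to everyone" robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# With the deny-all file, no path is fetchable by a compliant crawler.
print(allowed("bingbot", "https://example.com/private/demo"))
```

Keep in mind that robots.txt is advisory. It stops well-behaved crawlers and does nothing to protect the pages themselves.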
Personally, I'll never use an MS cloud service because of this anecdote - not that it was that likely to begin with.
I use Google Docs, very sparingly. One of the spreadsheets there contains a URL that is not linked from anywhere else and is impossible to guess. If that URL ever gets tripped it will send me an email, and the day that happens is the day I'll stop using Google services (so far so good, and of course I should say 'Google Drive' now instead of 'Google Docs').
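The tripwire idea is simple enough to sketch in a few lines of Python. This is not the setup described above, just one way it could look: `make_canary_url`, `tripped` and the `example.com` base are made-up names, and the email part is left out (something like `smtplib` would be one option for that).

```python
import secrets

def make_canary_url(base="https://example.com"):
    """Return (token, url) where the path is a random, unguessable token."""
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe base64
    return token, f"{base}/{token}"

def tripped(token, log_lines):
    """Return the log lines that mention the canary token."""
    return [line for line in log_lines if token in line]
```

Put the URL in exactly one document and nowhere else. Any line returned by `tripped` then means something read that document and followed the link.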
Either way, all publicly accessible documents will get indexed sooner or later.
These days we've got a lot more people and they show up all across the board.
Clearly if this is on the homepage, it was voted there by your peers. This kind of knowledge is completely obvious to many of us, but not everyone is on your level. Cut 'em some slack.
> To my bemusement, not only was the friend I was messaging away, I also hadn't even sent the link; I pasted it into the chat window but forgot to hit enter.
"On-Topic: Anything that good hackers would find interesting. That includes more than hacking and startups. If you had to reduce it to a sentence, the answer might be: anything that gratifies one's intellectual curiosity."
https://developers.facebook.com/docs/opengraphprotocol/
under "Domain Insights."
tail -f /var/log/apache2/access.log