But there is no call for a cryptographic hash here. This isn't being used as any sort of ID or to verify integrity outside of corruption.
The API works on top of TLS, which already includes cryptographic authentication of all data (usually via SHA-1/2 HMAC or AES-GCM).
The hash would be computed on the client right after reading from disk and right before TLS encryption, and since they seem to terminate TLS at the storage server, it would be verified right after TLS decryption and right before the data hits storage, so it doesn't seem to provide any gain.
I think they should just remove it, or at least make it optional.
This is all rare, but it does happen. This is why the GCS team wants to know if you are seeing corruption on file upload as it might be some bad hardware failing in a non-obvious way.
There's the write path from when B2 receives your bits to when they're stored on disk, for one. You could have unforeseen bugs in the code sitting on the other end of their upload URL (it's probably not all their own code, and even if it were, it was written by human developers).
Or B2's internal network path (if they have any) between that and the disk. Ideally that would provide integrity too, but maybe not. They offer a low price point and call out other compromises they make to achieve it (e.g. limited load balancing) - so while I really doubt it, it's remotely plausible they deem the internal overhead of SSL too high.
But then there's the potential for mismatch between "what the customer thinks they uploaded" and "what the customer actually uploaded" too! Less of an issue for now because their API only appears to support uploading files all at once, but eventually I'm sure they'll support a multipart upload scheme like the other platforms do. At which point uploads become more complicated since clients need to retain state and potentially resume. What if a client screws it up and there's some off-by-one error (or whatever)? If you can provide instant feedback, at upload time, that your clients provided bogus data, that's a good thing.
You can argue it's a painful requirement to force on users since it means they have to track/compute it themselves (which might be nontrivial for streaming applications), and that's fair. But there are enough points of failure, and the numbers involved are so large, that errors happening is a fact and you really need to insure against it. Especially here: your entire reason for existing is to reliably store bits, so it's kind of important to get it provably right.
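For the streaming case mentioned above, the tracking isn't actually that painful: you can hash incrementally as chunks pass through, without ever buffering the whole object. A minimal sketch using Python's stdlib (the chunk source and SHA-1 choice here are illustrative, not any provider's actual API):

```python
import hashlib

def sha1_of_stream(chunks):
    """Incrementally hash an iterable of byte chunks, so a streaming
    uploader can compute the checksum without holding the whole file
    in memory."""
    h = hashlib.sha1()
    for chunk in chunks:
        h.update(chunk)  # feed each chunk as it goes out on the wire
    return h.hexdigest()

# A client can feed each chunk to both the socket and the hasher,
# then send the finished digest at the end of the upload.
digest = sha1_of_stream([b"hello ", b"world"])
```

The same pattern works for resumable/multipart uploads: keep one hasher per part, and the server can reject a part the moment its checksum doesn't match.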
It seems completely sensible to err on the side of caution, especially as a new and relatively unproven platform (as an object storage platform provider I mean, obviously they have tons of experience storing things).
If you're handling data on behalf of others, it's paramount that you checksum data end-to-end. Amazon S3 allows you to do this by sending the MD5 or SHA along with the data. Google Cloud Storage allows you to do this with CRCs (which, despite what others in this thread say, are more appropriate for the task than crypto hashes, as long as you use enough bits).
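Concretely, S3's Content-MD5 header carries the base64 encoding of the binary MD5 digest, and GCS's CRC check uses a base64-encoded big-endian CRC32C. A rough sketch with the stdlib (the payload is made up, and zlib's plain CRC32 stands in for CRC32C, which needs a third-party library):

```python
import base64
import hashlib
import zlib

data = b"example object payload"  # hypothetical object body

# S3-style: Content-MD5 is the base64 of the raw 16-byte MD5 digest,
# not the hex string.
content_md5 = base64.b64encode(hashlib.md5(data).digest()).decode("ascii")

# GCS-style: a 4-byte big-endian CRC, base64-encoded. Illustrated here
# with zlib.crc32; GCS actually uses CRC32C (Castagnoli polynomial).
crc = zlib.crc32(data) & 0xFFFFFFFF
crc_b64 = base64.b64encode(crc.to_bytes(4, "big")).decode("ascii")
```

Either way the server recomputes the value over the bytes it received and fails the request on mismatch, which is exactly the end-to-end feedback being argued for.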
https://github.com/Cyan4973/xxHash
You would think that if it's just being used as a checksum, anything that passes https://code.google.com/p/smhasher/wiki/SMHasher with high marks would be sufficient.
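For corruption detection the bar really is that low: any decent checksum flags a flipped bit immediately. A toy demonstration with stdlib CRC32 (xxHash would behave the same way, but it lives in a third-party package):

```python
import zlib

original = b"a" * 1024
corrupted = bytearray(original)
corrupted[512] ^= 0x01  # flip a single bit, simulating silent corruption

# CRCs detect all single-bit errors by construction, so the two
# checksums are guaranteed to differ here.
assert zlib.crc32(original) != zlib.crc32(bytes(corrupted))
```

What a non-cryptographic checksum does not give you is resistance to a deliberate attacker crafting a collision, but as the thread notes, that isn't the threat model for upload integrity.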