I don't know if that's true, but there's something important the post doesn't address: the potential declining costs of providing online storage. Might the two not balance each other out for the foreseeable future?
I refer to this most excellent post by Backblaze, which outlines how they do storage: http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-h...
While we might not see many additional leaps of over 90% reduction in cloud storage costs, I wonder (a) how much headroom such innovation bought Backblaze, and (b) whether their main cost, hard drives, will keep pace with user demands.
You know the kind. Go to any speedtest site and somehow you're getting exactly 12MB down, 3MB up, but it never really seems to add up that way anywhere else. Even to my own office, which I know has an extra 100MB to burst, my downloads from the data center across town end up in the 3MB range 99% of the time.
The issue here, I think, is that "unlimited" really isn't unmetered, and it never really was. The industry has often paired that marketing gimmick with abundant exclusions, fine print, and far less ethical tricks like upload rate limiters that effectively impose limits behind the scenes.
Part of the catch-22 here is also that the decrease in storage costs for online storage providers (which increasingly build their clusters with consumer-grade hard drives) keeps an even pace with the decrease in cost of home storage and with the amount of data users want to upload.
This means that for "unlimited" they can recoup 6x the storage I was using per year through my fees. So the question becomes: why is it not sustainable? Too large a company, and not sustainable due to employee salaries? Not economical enough storage prices (i.e. using enterprise SAS disks rather than cheap SATA)? I'm guessing that since they buy large quantities of disks, they could get drives for even cheaper than what you get on NewEgg.
This is why I believe that this model _is_ sustainable, assuming that it's done right. It's also why I switched from Mozy to Backblaze: I felt that Mozy was gouging me by taking away their unlimited plan and replacing it with a tiered plan.
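For anyone who wants to redo the "recoup Nx the storage" arithmetic with their own numbers, here is a rough sketch. The function name and every input value are hypothetical and are not the figures behind the 6x above. It also only counts raw disk price, so redundancy, servers, power, bandwidth, and salaries are all ignored.

```python
def recoup_ratio(annual_fee_cents, drive_cents_per_gb, stored_gb):
    """Return how many times over one year's fees would pay for the raw
    disk space a user occupies. Raw drive cost only; everything else
    (redundancy, servers, power, staff) is deliberately left out."""
    if drive_cents_per_gb <= 0 or stored_gb <= 0:
        raise ValueError("drive price and stored size must be positive")
    purchasable_gb = annual_fee_cents / drive_cents_per_gb
    return purchasable_gb / stored_gb


# Hypothetical inputs: $60/year plan, $0.10 per GB of raw disk, 100 GB stored.
print(recoup_ratio(annual_fee_cents=6000, drive_cents_per_gb=10, stored_gb=100))
```

A ratio well above 1 only says the disks themselves are not the problem; whether the plan is sustainable depends on the costs this leaves out.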
I actually don't think bandwidth is in the same ballpark as these three items.
Just venting I guess, but beware if you are trusting your data to Mozy...
Give me NFS or SMB access if you must, but I'd love to get in on this cheap consumer cloud storage thing without resorting to lame hacks like using VMs or emulation due to lack of native access for my platform.
Take, for example, the case of backing up a folder full of files using rsync over ssh, vs. using the SpiderOak client.
Every time you run a backup job, rsync must examine the local folder _and_ ask the server to examine the remote folder, so it can draw conclusions about what needs to be transferred. In short, to do a new backup (a write operation), many reads are also required. Furthermore, those reads tend to be non-sequential (seeking to a bunch of different inodes to stat files, etc.).
Compare that to the SpiderOak client, which already has a near real-time, accurate database of exactly what exists on the server. There's no need to burden the server with a bunch of disk seeks (or any, actually) to assess what needs to be done. In short, the backup operation can be accomplished by the server using mostly sequential IO, writing only, because of this added intelligence in the client.
Aggregated across a large population of users, this difference in usage patterns greatly influences the hardware requirements and therefore the cost per GB.
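To make the difference concrete, here is a minimal sketch of the client-side-index idea described above. It is not SpiderOak's actual code: the function names and the manifest format (relative path mapped to a SHA-256 digest) are assumptions for illustration. The point is that the upload plan is computed entirely on the client, so the server does no reads at all.

```python
import hashlib
import os


def scan_local(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(full, "rb") as f:
                # Read in chunks so large files are not loaded into memory.
                for chunk in iter(lambda: f.read(65536), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full, root)] = digest.hexdigest()
    return manifest


def plan_upload(local_manifest, server_index):
    """Compare the local scan against the client's cached copy of the
    server's state. No request to the server is needed for this step."""
    to_upload = sorted(
        path for path, digest in local_manifest.items()
        if server_index.get(path) != digest
    )
    to_delete = sorted(path for path in server_index if path not in local_manifest)
    return to_upload, to_delete
```

With rsync, the equivalent of `server_index` has to be rebuilt by walking the remote folder on every run, which is where the non-sequential reads come from.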
...and by the way SpiderOak will run on just about any platform that Python will, with or without a GUI.
http://www.strato-hosting.co.uk/online-storage-hidrive/index...
(not a Strato customer myself)
Couldn't it reasonably be expected that, encrypted data aside, files with the same content are often used by more than one person?
> Greatly reduce backup & sync time through comprehensive compression and advanced de-duplication (saving you time)
> You are only charged for the compressed de-duplicated data amount (saving you money)
Still, it is not clear whether they do cross-user deduplication, but I think it is very unlikely because all the content is encrypted with a user-specific key, which I think they don't have access to.
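For reference, the basic de-duplication idea can be sketched as a content-addressed store: data is keyed by its hash, so identical content is kept only once. This is a toy illustration, not any provider's implementation, and the `DedupStore` class is a made-up name. It also shows why a user-specific key gets in the way: if each user's data is encrypted with a different key before upload, identical files generally produce different bytes and therefore different hashes, so they would not collapse across users.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: one copy per distinct byte string."""

    def __init__(self):
        self.blocks = {}  # SHA-256 hex digest -> stored bytes

    def put(self, data):
        """Store data if it is new and return the digest that identifies it."""
        digest = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(digest, data)
        return digest

    def stored_bytes(self):
        """Bytes actually kept, i.e. the de-duplicated amount."""
        return sum(len(block) for block in self.blocks.values())


store = DedupStore()
first = store.put(b"the same song file")
second = store.put(b"the same song file")  # duplicate, not stored again
store.put(b"a unique holiday photo")
print(first == second, store.stored_bytes())
```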
I seem to recall that Dropbox (or another well known online storage startup) implements this strategy. Maybe it works.
Videos, audio, photos, and game media will take up the bulk of the space. Of those, only photos and a proportion of the videos are likely to be totally unique to a user.
I don't see that they have anything to counter this in their model, and it kind of worries me that if people abuse this then they will remove the feature for all users, and I like my unlimited revisions.
Dropbox is based on Amazon S3, which means that not only do they have storage costs to a 3rd party that are, so to speak, 'out of their control', but they are also dealing with bandwidth and transaction costs.
I wish them all the best; however, I can imagine this being quite expensive for them considering the number of free users they have.
Really love their "realistic" pricing model, even cheaper with a .edu email address.
Had a lot of problems with CPU usage, may have been the thousands of files in my .git directories...
This leads me to support: they have been overwhelmed, and it has been difficult getting them to review my logs. They gave me multiple months free due to my non-usage, but I decided to cancel when I found Arq for Mac.
I asked them to cancel my account and give me a year's credit so I can give it a try in the future, and they credited my account for a year... pretty cool.
Wish them the best of luck!
Written from my mobi...
Also, it's slow to update things. You wouldn't expect this, given that Linux has inotify, but it is.
Their support is rather bad. I've emailed them about legitimate bugs, high CPU usage, SpiderOak not syncing, and a whole lot of other things, but they never credited me anything. I decided to buy it for a year to back up my photos because my disk is making weird noises, and it took them three days to reply to my "PayPal won't let me pay from my balance and I don't want to add a credit card" email, only to tell me to add a credit card.
I replied "yes, I don't want to add a credit card", and they haven't replied since I sent it three days ago. With that sales support, I wonder how they sell any copies.
For inotify, the biggest limitation we run into on Linux is that the default system configuration limits a user to watching a relatively small number of folders (6,000 I think, and that includes all subfolders recursively). You can change this in sysctl if you like, and we may add this change to future packages. In case you're curious, the SpiderOak directory watchers are tiny C programs for each platform, and are open source.
FYI -- We've definitely seen high CPU use when syncing hundreds of thousands of small files (source code, etc.), but this has been greatly improved in the latest beta, which just went out yesterday.
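If you want to check whether you are near that limit, something like the following should work. It is a quick sketch, not part of SpiderOak: the helper names are made up, and it assumes the limit is exposed at `/proc/sys/fs/inotify/max_user_watches` (the `fs.inotify.max_user_watches` sysctl), as it is on typical Linux systems. It counts one watch per folder, including the root and all subfolders.

```python
import os

# Assumed location of the per-user inotify watch limit on Linux.
INOTIFY_LIMIT_PATH = "/proc/sys/fs/inotify/max_user_watches"


def read_watch_limit(path=INOTIFY_LIMIT_PATH):
    """Return the configured watch limit, or None if it cannot be read
    (for example on a non-Linux system)."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def count_directories(root):
    """Count root plus every subfolder below it, recursively.
    os.walk yields one entry per directory and skips symlinked dirs."""
    return sum(1 for _ in os.walk(root))


if __name__ == "__main__":
    limit = read_watch_limit()
    needed = count_directories(os.path.expanduser("~"))
    print("folders to watch:", needed, "limit:", limit)
```

If the folder count exceeds the limit, raising `fs.inotify.max_user_watches` via sysctl is the change mentioned above.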
Thanks very much for the feedback.
http://ycombinator.com/newsguidelines.html
I realise the original article uses this title, but a slightly more peaceful one would be nice when posting to HN.
Looks like I'm in the market for a new backup solution.
I tried Mozy and a few others, but I always struggled with rather slow upload speeds, and ultimately found that the much easier way is to buy a USB hard disk, which I hide at my workplace and bring home once a month to make backups. (I store online only the few files I am actually working on, using Dropbox.)