kmod 6y ago

Sure, let's dig into networking. Who pays for rereplication traffic? If you do 64-of-96 RS encoding, that means for every failure you need to transfer 64x the lost storage capacity. If you're targeting a "low individual uptime but high aggregate uptime" model this means you need to be storing data in multiple sites -- and dedicated cross-geo bandwidth is expensive. I agree that in the happy case you can use low-bandwidth cheap equipment, but to get good reliability you need to provision for larger clustered failures such as rack- and row-level outages.

Taek 6y ago

Sure! First up, we don't do repairs every time one host goes down. Standard practice on the network is to wait to do a repair until a full 25% of the redundancy is missing (in 64-of-96, that would be 8 hosts offline). Then you repair all 8 at once, significantly reducing the total amount of repair traffic.
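The arithmetic behind batching repairs can be sketched as follows. This is only an illustration under stated assumptions: classical Reed-Solomon repair, where the repairer holds no local copy and must download k pieces to regenerate any number of missing ones. `repair_traffic` is a hypothetical helper, not part of any Sia client.

```python
def repair_traffic(k: int, lost: int, batch: int) -> dict:
    """Pieces moved to repair `lost` pieces of one chunk, `batch` pieces per repair round.

    Assumption: each repair round downloads k pieces to reconstruct the chunk,
    then uploads one regenerated piece per missing host.
    """
    rounds = -(-lost // batch)   # ceiling division: number of repair rounds
    download = rounds * k        # k pieces fetched per round
    upload = lost                # every lost piece is re-uploaded exactly once
    return {"download": download, "upload": upload, "total": download + upload}


# 64-of-96, 8 hosts lost: repair on every failure vs. wait and repair all 8 at once.
eager = repair_traffic(k=64, lost=8, batch=1)
lazy = repair_traffic(k=64, lost=8, batch=8)
print(eager)  # {'download': 512, 'upload': 8, 'total': 520}
print(lazy)   # {'download': 64, 'upload': 8, 'total': 72}
```

Under these assumptions, waiting for 8 failures cuts download traffic per lost piece from 64 pieces to 8.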

But secondly, offline doesn't usually mean dead and gone. Hosts in unstable datacenters like these are usually back online before the user has lost a full 25% of their redundancy.

Row level and rack level outages are handled by data randomization. The entire Sia system heavily depends on probabilistic techniques, both on the renting and hosting side. Row level failures will take out some of your data, but nobody should be disproportionately impacted by a cluster failure.

On Sia, each piece is at a different site. So 64-of-96 implies that each chunk of data (96 pieces to a chunk) is located in 96 different places. This doesn't help with the geo-bandwidth, but as discussed above there are other techniques to handle that.

Surprisingly, bandwidth pricing on the Sia network is even cheaper than storage pricing relative to centralized competition. That's a lot harder to model at scale though, so we aren't as confident the Sia bandwidth pricing will hold up at $1 / TB in the long term.

And technically, most of this stuff is customizable per-customer. If your particular use case has a different optimal parameterization, it's fairly easy to tune your client to suit your particular needs.

jstummbillig 6y ago

So, is this it? Dropbox designer challenges the project broadly, gets top comment, Author refutes – and we leave it at that?

I mean this is basically the moment where I would expect every systems designer on HN coming out of the woods and crushing Sia into the ground, if there were, in fact, any ground at all to crush Sia into.

Is this actually legit? If so, where is the rejoicing? What am I missing?

zzzcpan 6y ago

No, the author's idea is OK and he's mostly right on the cost part. The configuration and the numbers are a bit off and unrealistic: you won't get availability as low as 95% per site, due to other economic and technological constraints. You'll get at least 99%, and probably closer to three nines per site, so 64-of-96 won't be necessary at all (something like 8-of-12 could be enough). The Dropbox designer is just ignorant, biased and conditioned to the US market and environment, but appeals to authority, so people upvote his bad comment. I do storage too, on a smaller scale than Dropbox of course, and not in the US, but it is distributed and the cost is already lower than what you see in the title.
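To get a rough feel for how per-site availability trades off against the erasure-coding parameters, here is a minimal sketch. It assumes hosts fail independently, which is exactly what rack- and row-level outages violate, so treat the numbers as an optimistic bound. `unavailability` is a hypothetical helper written for this illustration.

```python
from math import comb


def unavailability(k: int, n: int, p_up: float) -> float:
    """Probability that fewer than k of n hosts are up, i.e. the chunk is unreadable.

    Assumption: each host is up independently with probability p_up.
    """
    p_down = 1.0 - p_up
    # Sum the binomial terms for 0..k-1 hosts up (the unreadable cases).
    return sum(comb(n, up) * p_up**up * p_down**(n - up) for up in range(k))


for k, n, p_up in [(64, 96, 0.95), (8, 12, 0.99), (8, 12, 0.999)]:
    print(f"{k}-of-{n} at {p_up:.1%} per host: P(unreadable) = {unavailability(k, n, p_up):.3e}")
```

Both configurations store 1.5x the raw data. The difference is how many simultaneous host failures each can absorb, which is what the per-host availability figure drives.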

Ajedi32 6y ago

The fact that nobody is able to prove it's a bad idea doesn't necessarily mean it's a good one. There might still be other downsides that haven't been considered, some of which could be solvable with more development work and some not.

At this point the cautious skeptic will be thinking "hmm, maybe there's something to this", not necessarily full on rejoicing.

That said, I agree it does seem promising. If you ever find yourself in need of cheap cloud storage it wouldn't hurt to look into Sia as a possible option.

fragmede 6y ago

The Dropbox designer name-dropped "64-of-96 RS encoding" as if they're the only person who's heard of, or dealt with, Reed-Solomon encoding before, and expected the author to get scared off. There is, in the case of Dropbox, plenty of ground to crush Sia into. That is the ground between 95% and multiple nines of availability.

Engineering is about tradeoffs. I could build a network as good as Google's with infinite money, infinite time, and infinite help. I could design a product as beautiful as Apple's with the same lack of limitations. Unfortunately for me, I have limited money, limited time, and limited help. Every systems designer understands that, innately, so isn't rushing out of the woodwork because Sia and Dropbox have merely chosen different tradeoffs. That one has IPO'd is uninteresting in the abstract. It's just money after all.

Legogris 6y ago

That sounds incredibly energy-inefficient. On average you have 12.5% of servers running but not contributing and possibly incurring load on other nodes.

H8crilA 6y ago

12.5% overhead isn't that much. It's what just the networking gear can easily eat in a data center (12% out of all the non-cooling-related power supply).

Reed-Solomon encoding adds 50% if you want 3 blocks per 2 data blocks. Replicated encoding (not relevant here since this is a low-throughput use case, but necessary if you want to sustain high read throughput) adds at least 200% (if you want 3x replication, which I think should be the minimum).
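Those overhead figures follow directly from the k-of-n ratio. A small sketch, treating N-way replication as a 1-of-N code (`storage_overhead` is a hypothetical helper for illustration):

```python
def storage_overhead(k: int, n: int) -> float:
    """Extra storage as a fraction of the raw data for a k-of-n scheme."""
    return n / k - 1.0


schemes = {
    "RS 2-of-3": (2, 3),
    "RS 64-of-96": (64, 96),
    "3x replication (1-of-3)": (1, 3),
}
for name, (k, n) in schemes.items():
    print(f"{name}: +{storage_overhead(k, n):.0%}")
# RS 2-of-3: +50%
# RS 64-of-96: +50%
# 3x replication (1-of-3): +200%
```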

monocasa 6y ago

Turn 'em off if they're not contributing.

dahfizz 6y ago

87.5% efficient is not incredibly inefficient.

manigandham 6y ago

Compared to what though? How efficient is a typical data center? Probably way less than this.

DuskStar 6y ago

I think his point is that he's targeting low aggregate uptime, too.

Taek 6y ago

Definitely not, aggregate uptime is extremely high. We've never seen downtime due to network outages, only software bugs. And even then, only some users were impacted by the bugs. We've never in 5 years had a broad outage.

DuskStar 6y ago

Gotcha, and your reply to the GP clarified a lot for me!
