It boils down to the following:
You can calculate a probability L of losing a given file.
Because we've assumed totally uncorrelated failures, L is the same for every file, and the probability of losing NO files when you have T files is (1 - L)^T.
As you can see, this approaches 0 as T increases, meaning Pr(losing at least one file) approaches 1.
Let's use the probability of file loss in Sia, which I would say is too low, but let's ignore that. They get L = 10^-19.
Since the expected number of lost files is roughly T * L, you expect to start losing data around T ≈ 1/L = 10^19. If you're erasure coding on the byte level, that's 10 exabytes.
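To make that threshold concrete, here's a minimal sketch (Python) that evaluates Pr(at least one loss) = 1 - (1 - L)^T for a few illustrative values of T, using the Sia figure L = 10^-19; the log1p/expm1 trick is only there because 1 - 10^-19 rounds to 1.0 in double precision:

    import math

    L = 1e-19  # per-file loss probability (assumed uncorrelated across files)

    def p_any_loss(T, L=L):
        # Pr(losing at least one of T files) = 1 - (1 - L)^T,
        # computed via log1p/expm1 to avoid floating-point underflow.
        return -math.expm1(T * math.log1p(-L))

    for T in (1e9, 1e15, 1e19, 1e21):  # ~GB up to ~ZB if each unit is a byte
        print(f"T = {T:.0e}: Pr(at least one loss) ~ {p_any_loss(T):.3g}")

At T = 10^19 this gives about 0.63 (i.e. 1 - 1/e), which is the "expect to lose data" point; by 10^21 it's effectively 1.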
I expect your probability of failure is much lower than that of random nodes on a distributed global network of volunteers. So yes, ~a petabyte is below the threshold, but there is a threshold.