It boils down to the following:
You can calculate a probability L of losing a given file.
Because we've assumed totally uncorrelated failures, L is the same for every file, and the probability of losing NO files when you have T files is (1 - L)^T.
As you can see, this approaches 0 as T increases, meaning Pr(losing at least one file) approaches 1.
Let's use the probability of file loss in Sia, which I would say is too low, but let's ignore that. They get L = 10^-19.
Since the expected number of lost files is roughly T * L, you expect to start losing data around T ≈ 1/L = 10^19. If you're erasure coding on the byte level, that's 10 exabytes.
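To make that threshold concrete, here's a minimal sketch (Python) that evaluates Pr(at least one loss) = 1 - (1 - L)^T for a few illustrative values of T, using the Sia figure L = 10^-19; the log1p/expm1 trick is only there because 1 - 10^-19 rounds to 1.0 in double precision:

    import math

    L = 1e-19  # per-file loss probability (assumed uncorrelated across files)

    def p_any_loss(T, L=L):
        # Pr(losing at least one of T files) = 1 - (1 - L)^T,
        # computed via log1p/expm1 to avoid floating-point underflow.
        return -math.expm1(T * math.log1p(-L))

    for T in (1e9, 1e15, 1e19, 1e21):  # ~GB up to ~ZB if each unit is a byte
        print(f"T = {T:.0e}: Pr(at least one loss) ~ {p_any_loss(T):.3g}")

At T = 10^19 this gives about 0.63 (i.e. 1 - 1/e), which is the "expect to lose data" point; by 10^21 it's effectively 1.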
I expect your probability of failure is much lower than that of random nodes on a distributed global network of volunteers. So yes, ~a petabyte is below the threshold, but there is a threshold.