If you replace the failed (or failing) node right away, the failure probability goes down greatly. What you need is the probability of a node going down within a 30-minute window, assuming the migration can be done in 30 minutes.
(I hope this calculation is correct)
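If the probability is 1% per month, then per 30 minutes it is 1%/(43800/30) = (1/1460)%, i.e. 1/146 000 as a plain fraction (43800 is the number of minutes in an average month).
For three instances, assuming they fail independently: (1/146 000)^3 = 1/(3.112136 × 10^15) probability that all go down in the same 30 min window. Note that the percentages have to be converted to fractions before multiplying, otherwise the result is off by a factor of 10 000.
Calculated for one month: 1/(3.112136 × 10^15) * (43800/30) = 1/(2.1316 × 10^12)
So roughly one in 2.13 trillion that all three servers go down in the same 30 minute time span somewhere in one month. After the 30 minutes another replica will already be available, making the data safe.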
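To double-check the arithmetic, here is a minimal Python sketch of the same estimate. It assumes the three nodes fail independently, that the 1% monthly risk is spread evenly over non-overlapping 30-minute windows, and that multiplying by the number of windows is a good enough approximation for such small probabilities. The constant names are just illustrative.

```python
from fractions import Fraction

MINUTES_PER_MONTH = 43_800   # 730 hours, the average month used above
WINDOW_MINUTES = 30          # assumed time needed to bring up a new replica
P_MONTH = Fraction(1, 100)   # assumed 1% chance that one node fails in a month
REPLICAS = 3

# Number of 30-minute windows in a month (1460).
windows_per_month = Fraction(MINUTES_PER_MONTH, WINDOW_MINUTES)

# Chance that one node fails in a given window, as a fraction (not a percent).
p_window = P_MONTH / windows_per_month

# Chance that all replicas fail in the same window, assuming independence.
p_all_in_window = p_window ** REPLICAS

# Approximate chance that this happens in at least one window of the month.
p_all_in_month = p_all_in_window * windows_per_month

print("per node, per window:", p_window)
print("all replicas, per month:", p_all_in_month, "~", float(p_all_in_month))
```

Using exact fractions avoids the percent-times-percent trap, and changing `WINDOW_MINUTES` shows how strongly the result depends on how fast a replacement comes up.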
I'm happy to be corrected. The probability course was some years back :)