The test is not repeatable. A power outage, random machine configurations, and random network conditions are all great ways to stress redis-cluster, which is still young, but they are not repeatable tests.
They are not great tests. A great test would be a script that runs every 24 hours over a group of servers, simulating or creating similar events and network conditions. Then, when a code change is made, you can actually observe whether it made things better.
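To make "repeatable" concrete, here is a minimal sketch of one way such a script could decide what to break on each run: derive the whole fault schedule from a seed, so the same seed replays the same events after a code change. The fault names, node names, and the `build_schedule` helper are all hypothetical and only for illustration. Nothing here talks to Redis, and a real harness would still have to map each fault to an actual action on the servers.

```python
import random
from dataclasses import dataclass

# Hypothetical fault names, not part of any Redis tooling.
# A real harness would map each one to a concrete action.
FAULTS = ("kill_node", "restart_node", "partition_network", "add_latency")


@dataclass(frozen=True)
class FaultEvent:
    at_second: int  # offset from the start of the run
    node: str       # which server the fault targets
    fault: str      # one of FAULTS


def build_schedule(seed, nodes, events=5, duration_s=3600):
    """Return a fault schedule that depends only on the arguments."""
    rng = random.Random(seed)  # private generator, so the seed fully fixes the output
    schedule = [
        FaultEvent(rng.randrange(duration_s), rng.choice(nodes), rng.choice(FAULTS))
        for _ in range(events)
    ]
    return sorted(schedule, key=lambda e: e.at_second)


if __name__ == "__main__":
    # Assumed node names; log the seed with each nightly run so it can be replayed.
    for event in build_schedule(seed=42, nodes=["node-1", "node-2", "node-3"]):
        print(event)
```

The conditions are still random, but if a run exposes a problem you can rerun the exact same sequence against the patched code and compare.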
Driving a car across the Sahara Desert can be a good trial, but it shouldn't replace crash tests in a lab environment.
However, I believe I'll still do the kind of tests described in the blog post: the very fact that these tests are so "home brew" and cheap makes the conditions extremely random, which is good for spotting actual real-world deficiencies.