I guess the next study would need to look into which manufacturers actually adhere fully to the spec. They haven't exactly shown their best behaviour in that regard in the past (lying about fsync, etc.) :(
Importantly, this means you can't claim stronger semantics around e.g. atomicity. You can certainly work around most if not all of these issues using redundancy, verification and distribution. But on a single device in isolation you cannot even properly observe extreme failure scenarios; you can only reduce their probability, and even that only up to a point.
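To make the "redundancy and verification" point a bit more concrete, here is a rough Python sketch. Everything in it is my own assumption for illustration: the names `write_verified`, `read_verified` and `read_any` are hypothetical, and the on-disk layout (a SHA-256 digest followed by the payload) is just one simple choice. It only lowers the odds of silently reading bad data; it does not make a lying drive honest.

```python
import hashlib
import os
import tempfile

DIGEST_LEN = 32  # SHA-256 digest size in bytes


def write_verified(path: str, payload: bytes) -> None:
    """Write digest + payload to a temp file, fsync it, then rename into place."""
    record = hashlib.sha256(payload).digest() + payload
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(record)
            f.flush()
            # fsync is only a request: a drive that lies can still lose this.
            os.fsync(f.fileno())
        # Rename over the target so readers see the old or the new file.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def read_verified(path: str) -> bytes:
    """Return the payload, or raise ValueError if the checksum does not match."""
    with open(path, "rb") as f:
        record = f.read()
    digest, payload = record[:DIGEST_LEN], record[DIGEST_LEN:]
    if len(record) < DIGEST_LEN or hashlib.sha256(payload).digest() != digest:
        raise ValueError(f"checksum mismatch in {path}")
    return payload


def read_any(paths):
    """Try each replica in turn and return the first one that verifies."""
    for p in paths:
        try:
            return read_verified(p)
        except (OSError, ValueError):
            continue
    raise ValueError("no replica passed verification")
```

Usage would be something like writing the same payload to paths on different devices and calling `read_any` over all of them. Note that this sketch does not fsync the containing directory, so the rename itself may not survive a crash, and replicas on the same physical drive obviously share its failure modes.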
Unfortunately, the situation we're in right now makes the hardware problem even worse: the APIs on top of it are themselves prone to introducing bugs in programs.
Bad hardware should mean the software gets crafted extra carefully to balance it out a bit, but somehow we ended up with bad hardware and bad software :(