I have no hard-earned experience with this myself. I've just been thinking about it a lot because I have a potentially write-intensive application in the works, and I'm anticipating performance issues with writes (though I should probably find out whether it actually is an issue). My goal is to squeeze as much performance as I can out of as few machines as possible, since I'm poor.
To explain my use case: I'm basically storing a graph in MongoDB, which is a document/object-based database. Reads would consist of enumerating a node's edges. Writes would be adding nodes/edges and updating edge weights. Reads would perform best if I just had one contiguous object for each node, holding information about all the nodes connected to it as well as the edge weights. But since some nodes may have too many connected nodes, they are divided in the database by the page they would show up on in the application, as ranked by edge weight. So an object with id 'node1page1' holds information about the top 10 ranked connecting nodes.
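As a rough sketch of that layout, here is how a node's edges could be split into page objects. Plain dicts stand in for the MongoDB documents, and the names `build_pages`, `page_id`, `to` and `weight` are made up for illustration; only the 'node1page1' id format and the page size of 10 come from the description above.

```python
PAGE_SIZE = 10  # "top 10 ranked connecting nodes" per page


def page_id(node, page):
    """Build an id such as 'node1page1' (pages are numbered from 1)."""
    return f"{node}page{page}"


def build_pages(node, edges, page_size=PAGE_SIZE):
    """Split a node's edges into page documents, highest weight first.

    `edges` maps a connected node's name to its edge weight.
    Field names ("to", "weight") are hypothetical.
    """
    ranked = sorted(edges.items(), key=lambda kv: kv[1], reverse=True)
    pages = []
    for start in range(0, len(ranked), page_size):
        chunk = ranked[start:start + page_size]
        pages.append({
            "_id": page_id(node, start // page_size + 1),
            "edges": [{"to": target, "weight": w} for target, w in chunk],
        })
    return pages
```

A read for one page of a node is then a single lookup by `_id`, which is the whole point of the layout.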
Now, if that made any sense, the problem: Edge weights change according to votes from users. So when a weight changes, ranks change, and I've structured things in such a way that the computation of ranks is basically pushed to the write rather than the read. So when someone checks 'node1page2' and votes a node on page 2 up, the corresponding edge weight changes and may require that node's information to be shifted to 'node1page1', while the lowest edge on 'node1page1' is to be shifted to 'node1page2'.
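To make the shuffling concrete, here is a minimal sketch of what one vote could do to the pages, again using plain dicts in place of the stored objects. The function name `apply_vote` and the `to`/`weight` fields are assumptions for the example, and it ignores concurrency entirely.

```python
def apply_vote(pages, target, delta=1):
    """Change one edge weight, then restore descending order across pages.

    `pages` is the ordered list of page documents for a single node.
    Returns the ids of the pages that changed and need writing back.
    """
    # Every (page index, slot index) position, in rank order.
    slots = [(p, s) for p, page in enumerate(pages)
             for s in range(len(page["edges"]))]

    def edge(i):
        p, s = slots[i]
        return pages[p]["edges"][s]

    dirty = set()

    def swap(i, j):
        (pi, si), (pj, sj) = slots[i], slots[j]
        a, b = pages[pi]["edges"][si], pages[pj]["edges"][sj]
        pages[pi]["edges"][si], pages[pj]["edges"][sj] = b, a
        dirty.update((pages[pi]["_id"], pages[pj]["_id"]))

    i = next((k for k in range(len(slots)) if edge(k)["to"] == target), None)
    if i is None:
        raise KeyError(target)

    edge(i)["weight"] += delta
    dirty.add(pages[slots[i][0]]["_id"])

    # Upvote: move towards the front while it outranks its predecessor.
    while i > 0 and edge(i)["weight"] > edge(i - 1)["weight"]:
        swap(i, i - 1)
        i -= 1
    # Downvote: move towards the back while its successor outranks it.
    while i < len(slots) - 1 and edge(i)["weight"] < edge(i + 1)["weight"]:
        swap(i, i + 1)
        i += 1
    return dirty
```

The returned set shows the cost: a vote that crosses a page boundary turns one logical update into writes against two (or more) objects.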
So what's happened here is that while I've made reads fast by requiring just a lookup, writes become complicated. And here's where the buffering of writes comes in. Perhaps I'm overthinking all this, but like I said, I'm poor.
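The kind of write buffer I have in mind could look roughly like this: votes are summed in memory and only pushed to the database once per interval. This is a sketch under assumptions, not a tested design. The class name `VoteBuffer`, the `write` callback and the 30-second default are all illustrative, and anything still in the buffer is lost if the process dies.

```python
import time
from collections import defaultdict


class VoteBuffer:
    """Accumulate vote deltas in memory and flush them in batches."""

    def __init__(self, flush_interval=30.0, clock=time.monotonic):
        self.flush_interval = flush_interval
        self.clock = clock
        self.pending = defaultdict(int)  # (node, target) -> net delta
        self.last_flush = clock()

    def add(self, node, target, delta=1):
        """Record a vote without touching the database."""
        self.pending[(node, target)] += delta

    def due(self):
        """True once the flush interval has elapsed since the last flush."""
        return self.clock() - self.last_flush >= self.flush_interval

    def flush(self, write):
        """Call write(node, target, delta) once per edge with a net change.

        Returns the number of writes issued.
        """
        batch, self.pending = self.pending, defaultdict(int)
        self.last_flush = self.clock()
        written = 0
        for (node, target), delta in batch.items():
            if delta:  # an upvote and a downvote cancel out: nothing to write
                write(node, target, delta)
                written += 1
        return written
```

Ten votes on the same edge within one interval collapse into a single weight update, so the expensive rank shuffle happens once instead of ten times.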
In-memory replication would definitely be useful, especially for a larger cluster with some spare capacity. You'd probably have good power redundancy when running a large cluster, so I'd think single-machine failures would be much more common than a total cluster failure (no hard data here, but it seems reasonable enough). So I don't think it's all that naive to believe that, generally, no more than one host would go down within the same 30s.
Even if the whole cluster does go down, you should only be doing this with low-value data. In my application's case, losing 30 seconds of votes is no big deal at all; I'd be much more concerned with getting the machine(s) back up.