For data that important, I would have mirrored the databases to "warm standby" servers. They could have been back up in minutes with no data loss. Sure, it would have doubled the cost, but how much money did they lose during the outage?
Otherwise you'd know that they had a fault that propagated to the hot spare. It's also utterly daft to think that a financial enterprise as large as JPM/Chase wouldn't already be running an HA setup. In this case it appears to be Oracle RAC.
I'm astounded how often I have to remind people that replication and backups are very different things, and that you need both.
I'm also depressed by how many utterly thoughtless comments are made here on Hacker News lately.
At my shop we had a similar issue (but at the SAN level, not the DB level) where the corruption came from data that exposed a bug in the system. The data was automatically mirrored to the warm standby machine. When PROD crashed, the standby was brought up and immediately crashed too. We had to rebuild from tape backups, which was stupid-slow (trademarked term there ;-). All in all it was a horrible mess that was root-caused to a bug in vendor firmware. Eerily similar to the JPMorgan Chase issue in the OP.
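To make that failure mode concrete, here is a toy Python sketch of why a mirror alone doesn't save you. Everything in it is made up for illustration: the `Store` class, the `"POISON"` value standing in for the data that trips the bug, and the in-memory "backup". It is not how any real SAN or database replicates.

```python
import copy

class Store:
    """Toy key-value store; writing "POISON" stands in for data that trips a bug."""
    def __init__(self):
        self.data = {}
        self.crashed = False

    def write(self, key, value):
        self.data[key] = value
        if value == "POISON":  # hypothetical trigger for the shared bug
            self.crashed = True

primary, standby = Store(), Store()

def replicated_write(key, value):
    # Mirroring faithfully applies every write to both nodes, good or bad.
    primary.write(key, value)
    standby.write(key, value)

replicated_write("acct:1", 100)
backup = copy.deepcopy(primary.data)   # point-in-time copy taken before the bad write
replicated_write("acct:2", "POISON")

both_down = primary.crashed and standby.crashed

# Recovery from backup: a fresh node loaded with the older, clean copy.
restored = Store()
restored.data = copy.deepcopy(backup)
```

After the bad write, `both_down` is true because the standby received exactly the same data as the primary. Only the older point-in-time copy is clean, at the price of losing whatever was written after it was taken.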
I am not saying they didn't have one, just that disaster recovery scenarios should factor into such outages. Hypothetical fire drills etc. are needed at critical businesses such as banks.
My guess is that a bunch of people @ jpmc will most likely be losing their jobs over this.
They were able to identify the problem, successfully recover from backup, and successfully replay the missing transactions in a reasonable amount of time for a setup this large. In my book it's a success.
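For anyone unfamiliar with what "recover from backup and replay" means, a rough Python sketch of the idea follows. The names (`restore_and_replay`, `is_valid`), the log format, and the validity check are all assumptions for illustration. Nothing here reflects what JPM or Oracle actually did.

```python
def restore_and_replay(snapshot, log, is_valid):
    """Rebuild state from a backup snapshot, then re-apply logged transactions.

    snapshot: dict of account -> balance as of the backup
    log:      list of (txn_id, account, delta) recorded after the backup
    is_valid: hypothetical check that rejects entries that cannot be applied
    """
    state = dict(snapshot)          # start from the restored backup
    skipped = []
    for txn_id, account, delta in log:
        if not is_valid(txn_id, account, delta):
            skipped.append(txn_id)  # set aside for manual review
            continue
        state[account] = state.get(account, 0) + delta
    return state, skipped

snapshot = {"acct:1": 100, "acct:2": 50}
log = [(1, "acct:1", -30), (2, "acct:2", None), (3, "acct:2", 30)]

state, skipped = restore_and_replay(
    snapshot, log, lambda txn_id, account, delta: isinstance(delta, int)
)
```

The slow part in real life is the restore itself plus working out which logged transactions are safe to re-apply, which is why doing it in a reasonable time at that scale is not trivial.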
Btw, while everybody's salivating over "Oracle failure", was it really an Oracle failure, i.e. a failure of Oracle itself?
This usually works, which is why people think it's an acceptable policy. But real planning involves things like software correctness, proper test procedures, ways of making a test environment that's exactly identical to production, and so on. This is hard (and slows down development... "tests, what a waste of time!"), so people instead say, "let's try really hard to not fuck something up".
Trying hard only gets you so far, as JPM learned.
Could happen to any database. Not just Oracle.