I agree. In general, outages are almost inevitable, but global outages shouldn't occur. A global one suggests at least one of a couple of things:
1) Bad software deployments, without proper validation. A message elsewhere in this post on HN suggests that problems have been occurring for at least 5 days, which makes me think this is the most likely situation. If this is the case, then given that we're multiple days into the issue, rolling back presumably isn't an option. That doesn't say good things about their testing or deployment stories, or possibly their monitoring of the product. Even if the deployment validation processes failed to catch it, you'd really hope alarming would have.
or:
2) Regions aren't isolated from each other. Cross-region dependencies are bad for all sorts of obvious reasons, the main one being that a failure in one region can take the others down with it.
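
To make the first point a bit more concrete, here's a rough sketch of the kind of staged rollout I'd expect to keep a bad deployment from going global. Everything in it is hypothetical: `staged_rollout`, the `deploy` / `healthy` / `rollback` callbacks and the region names are made up for illustration, and I have no idea what their actual pipeline looks like.

```python
from typing import Callable, List, Optional, Tuple


def staged_rollout(
    regions: List[str],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> Tuple[List[str], Optional[str]]:
    """Deploy to one region at a time and stop at the first unhealthy one.

    Returns (regions successfully updated, region that failed or None).
    All callbacks are hypothetical hooks supplied by the caller.
    """
    updated: List[str] = []
    for region in regions:
        deploy(region)
        if not healthy(region):
            # Validation failed: undo this region and leave the rest untouched.
            rollback(region)
            return updated, region
        updated.append(region)
    return updated, None
```

The point is only that the blast radius of a bad release is one region, and that only works if the health check for a region doesn't depend on any other region, which is where the second point comes in.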