People need to be blamed, and responsibility for actions taken (without covering asses)
I have no empathy for Fastly-the-company. I hate the fact that the Internet is centralized around CDNs. I wish this idea of 'but we _must_ run a CDN for our 1QPM blog!' would die in a fire. But I can still empathize with the Fastly engineers handling this shitstorm right now.
Do a post-mortem, work out root causes, work as a unit to ensure this doesn't happen again.
Obviously if there are levels of gross negligence or misconduct discovered during post-mortem, that will need to be dealt with accordingly, but coming into this with an attitude of "we must find someone to blame and incur repercussions" isn't healthy at all.
We are humans - don't forget that.
edit: forgot some words.
"An atmosphere of blame risks creating a culture in which incidents and issues are swept under the rug, leading to greater risk for the organization."
The best way (in a team) to tackle mistakes is to ensure the process in place corrects them. The only way to do that is a post-mortem/learning from the mistake. If you blame it on the engineer who did it, that guy will eventually be replaced by some other guy, who may make the same mistake.
And we, especially companies, typically only learn if there is something at stake: stock price, a job, customers, liability, etc.
(Call me old fashioned, but what I learned from it, having no stake in the game, is we are truly demolishing the resilient, decentralised nature of the internet; or already have done so)
Post-mortems make far more interesting submissions IMO, but I suppose people up-vote 'yes down for me too'.
A good leader will take the hit (and the repercussions) for their underlings, compensate customers where compensation can make it better (and offer to make it easy to use fallbacks if this happens again) -- and internally fix the problem so it can't happen again, without throwing anyone to the dogs.
What I think this syntactically invalid sentence is trying to say is:
People need to be blamed, and held responsible for actions taken.
Why do people need to be blamed? Why do we need to make someone the scapegoat? What does being held responsible look like?
Let's say we find some sacrificial engineer to pin this on:
* does the downtime magically disappear?
* does the engineer's suffering (say, losing his job or whatever) make your downtime meaningful? Will you somehow recoup your revenue from it?
* does the fact that there's a scapegoat mean that everyone else at Fastly is perfect and it's OK to keep using them?
Empathy and responsibility are not mutually exclusive.
This. People talk about "HugOps", "empathy" and all that, but a worldwide incident that lasts an hour and affects a huge number of time-critical customers (e.g. trading, HFT, cargo, food delivery, etc.) has catastrophic consequences.
I hope the engineers also understand the other side and why we are paying huge sums of cash for their service.