When organizations scale up, and especially when they're managing risk, it's easy for them to shift toward control. This is especially true when people can score internal points by assigning or shifting blame.
Controlling and blaming are terrible for creative work, though. And they're also terrible for increasing safety beyond a certain pretty low level. (For those interested, I strongly recommend Sidney Dekker's "Field Guide to Understanding Human Error" [1], a great book on how to investigate airplane accidents, and how blame-focused approaches deeply harm real safety efforts.) So it's great to see Slack finding a way to scale up without losing something that has allowed them to make such a lovely product.
[1] https://www.amazon.com/Field-Guide-Understanding-Human-Error...
We had a guy who more or less appointed himself manager when the previous engineering manager decided he couldn't deal with the environment anymore. His insistence on controlling everything led to a conscious decision to destroy the engineering wiki and knowledge base, forcing everyone to funnel through him as the single source of truth. Once his mind was made up on something, he would berate other engineers, developers, and team members to get what he wanted. Features stopped being developed and things began to fail chronically, but because senior leadership weren't tech people, they all deferred to him. When they officially made him engineering manager (for no reason other than that he had been on the team the longest, since people were beginning to wise up and quit the company), all but 2 of the 12 people in the engineering department quit because no one wanted to work for him.
Imagine my schadenfreude after leaving that environment to find out they were forced to close after years of failing to innovate, resulting in the market catching up and passing them. Never in my adult life have I seen a company inflict so many wounds on itself and then be shocked when competitors start plucking customers off like grapes.
The reviewer rejected my Slack add-on twice, but was really nice about it, gave specific reasons, encouraged me to fix it and reapply, etc.
A very pleasant experience compared to some of the other systems where it feels like you're begging to be capriciously rejected.
I strongly dislike repetitive mental work, and writing a checklist essentially means resigning myself to the fact that such work will be necessary. Until I write it, I can still convince myself I'll be able to automate the process.
Similar boring processes in a tech business can be checklisted to increase efficiency, and the checklist itself can be iterated over. Everything from designing a new feature, to troubleshooting an error, to addressing a customer support ticket, to getting access to a new resource should use checklists.
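One way to make that concrete: treat the checklist itself as data, so it can be reviewed and iterated on like any other artifact. A minimal sketch, with hypothetical step names (this isn't any particular team's process):

```python
# Hypothetical example: a "getting access to a new resource" checklist
# kept as plain data, so the steps can be versioned and iterated on.
ACCESS_REQUEST_CHECKLIST = [
    "Confirm the requester's team and role",
    "Verify the resource owner has approved access",
    "Grant the minimum permissions needed",
    "Record the grant and its expiry in the access log",
]

def outstanding_steps(steps, is_done):
    """Return the steps not yet completed, given a predicate."""
    return [step for step in steps if not is_done(step)]

# Nothing done yet, so every step is still outstanding.
remaining = outstanding_steps(ACCESS_REQUEST_CHECKLIST, lambda s: False)
print(len(remaining))  # 4
```

Because the list is just data, iterating on the process is a one-line diff rather than retraining everyone on a new ritual.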
If I had to switch to some other note taking platform it would probably break my flow enough that I wouldn't do it.
People too often get used to the routine and end up skipping bits from checklists, or even outright missing steps. It's strange, given that the whole idea of having something to actually check should mitigate exactly that.
Generally I do my utmost to avoid having checklists, reserving them for things where they can't be avoided (e.g. where automation makes no sense, or would potentially make things worse).
It seems like just a simple checklist app, but having a non-Jira process that takes only a few minutes is so valuable, while "security reviews" and "threat models" as part of your SDLC take insane amounts of time and honestly aren't super helpful.
That's a lot of people...
However, I would want to caution: I think this model works because Slack has a self-described "culture of developer trust". I tend to think they hire bright engineers and ensure they are equipped to do the right thing. I believe the vast majority of organizations are NOT ready for this. I direly want them to be, but the simple fact is there are too many mediocre developers, and they can't be trusted without guardrails (and some straight up need babysitters).
No, seriously, I was wondering if that tool has a CLI? It might make it more accessible for some devs.
What? They spend 1000 minutes out of every 1440 deploying to production? The deployment process is occurring over 16 hours out of every 24? Am I the only one who is nonplussed by this?
EDIT: Ok I get it, I get it. I guess I always worked in much smaller companies where CD meant deploying about 10 times a day tops. TIL big companies are big.
A culture of continuous deployment is often hard to fathom for people who've never worked at a company with one. Everything, down to what you write and how you write it, is influenced by being able to deploy it and see its effects almost instantly.
These aren't huge, sweeping changes being deployed. They're small pieces of larger feature sets. It's more like: deploy a conditional statement with logging and confirm from the logs that it works; next, deploy the view you're testing, behind a feature flag toggled so that only you and the PM can see it, called from the earlier conditional; when things look good, deploy some controller code that handles form requests from the view; and so on.
You deploy small changes piecemeal and so spread out the risk over a larger period of time. It makes identifying issues with a new piece of code almost trivial. Needing to debug 30 lines of code is so much less harrowing than needing to look over 900.
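The flag-gated pattern described above looks roughly like this. A sketch only: the flag store, names, and render functions are all made up for illustration, and a real system would toggle flags through a config service rather than a dict:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)

# Hypothetical flag store; in production this would be togglable
# without a redeploy, so you can flip it for just you and the PM.
FLAGS = {"new_checkout_view": False}

def render_checkout(user):
    # Deploy 1: just this conditional plus logging, to confirm from
    # the logs that the code path is actually reached in production.
    if FLAGS["new_checkout_view"]:
        log.info("new checkout view shown to %s", user)
        return render_new_view(user)  # Deploy 2: the view itself
    return render_old_view(user)

def render_new_view(user):
    return f"new view for {user}"

def render_old_view(user):
    return f"old view for {user}"

print(render_checkout("alice"))  # old view for alice
```

Each deploy in the sequence is tiny, so when something breaks, the suspect diff is 30 lines, not 900.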
(Likely) various groups of people are deploying to production throughout the day. Out of those 100 deploys, an individual is probably only involved in 1 or 2 a day. As soon as you're ready to deploy your code, you queue up and see it all the way through to production along with probably a few other people doing the same thing.
The actual "change the servers over to the new production code" process is usually instantaneous or extremely quick, the 10 minutes is mostly spent testing/building/etc.
People (including myself) enjoy this because you can push very small incremental changes to production, which significantly reduces the chance of confounding errors or major issues.
Note that this would be a Sisyphean task if your company doesn't have great logging/metrics reporting/testing/etc.
I was excited for the move to a large corporation where there would be amazing room for growth and learning.
I have to say that almost a year into my work on this project, I was absolutely stunned how inept this company was at coordinating a technology project.
Something a small team could accomplish in a matter of months was taking hundreds of developers, and hundreds more in supporting and operational roles, years to accomplish. My guess is the developers on this project would gladly trade places with Sisyphus.
There are some legitimate needs for continuous deployment, the rest of it is cargo culting.
Instead imagine each deploy as an edge, and imagine them to be near instantaneous. With respect to an instance, and a user, and the observable effects of a deploy, this paints a more accurate picture. 100 times a day means one deploy every 14.4 minutes.
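The arithmetic behind that figure, for anyone checking (assuming deploys are spread evenly across the day):

```python
MINUTES_PER_DAY = 24 * 60  # 1440

def deploy_interval(deploys_per_day):
    """Average minutes between deploys, assuming an even spread."""
    return MINUTES_PER_DAY / deploys_per_day

print(deploy_interval(100))  # 14.4
```

The cutover itself is near-instant, so multiplying 100 deploys by a 10-minute pipeline double-counts time when pipelines overlap.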
How many times a day do you think Amazon deploys changes? Or Facebook? Or Google?
So having more pushes per day isn't necessarily the metric to maximize. Quality of code changes for each push is important, and this is where automated testing can be very valuable. The goal is for automated testing to be a "gatekeeper of bad code".
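At its simplest, the "gatekeeper" is just a pipeline step that refuses to ship on any failure. A minimal sketch, not any particular CI system's API; the callables stand in for whatever your pipeline actually runs:

```python
def gated_deploy(run_tests, deploy):
    """Run deploy() only if run_tests() reports zero failures.

    run_tests: callable returning a list of failing test names.
    deploy: callable performing the actual release.
    """
    failures = run_tests()
    if failures:
        # Bad code stops here and never reaches production.
        raise SystemExit(f"blocked: {len(failures)} failing test(s): {failures}")
    deploy()

# Example with stub callables: an empty failure list lets the deploy run.
gated_deploy(run_tests=lambda: [], deploy=lambda: print("deployed"))
```

The gate is only as good as the suite behind it, which is the caveat the next paragraph gets into.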
But even this system isn't perfect, and it's possible to deploy things that pass tests but still have show-stopping bugs. Or for the code to cause your tests to misbehave: I'm seeing this now with Tape.js on Travis, where Tape sees my S3 init calls as extra tests. Then my build fails because, of the 2 tests specified, 3 failed and another 4 passed.