Now, without crying: I have seen several big companies get rid of their NOC and replace it with on-call duty spread across multiple focused teams. Instead of 12 people sitting 24/7 in groups of 4, doing some basic analysis and first steps before calling others, you page the right people within 3-5 minutes, with an exact and specific alert.
Incident resolution times went down a lot (2-10x, depending on the company), people no longer have to sit through night shifts where they sleep most of the time, and nobody takes stupid actions like a service restart that only slow down incident resolution.
And I don't like that some platforms hire 1500 people for a job that could be done by 50-100, but in terms of incident response: if you already have teams with separated responsibilities, then a NOC is "legacy".
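To make the "page the right people" part concrete, here is a minimal sketch of routing an alert straight to the owning team's on-call instead of through a NOC. Everything in it is made up for illustration: the service names, team names, people, and the `incident-commander` fallback are assumptions, not any real paging tool's API.

```python
from dataclasses import dataclass

# Hypothetical ownership map: service name -> owning team.
SERVICE_OWNERS = {
    "checkout-api": "payments",
    "search-index": "search",
}

# Hypothetical roster: team -> engineer currently on call.
ON_CALL = {
    "payments": "alice",
    "search": "bob",
}

# Assumed catch-all for alerts that nobody owns yet.
FALLBACK = "incident-commander"


@dataclass(frozen=True)
class Alert:
    service: str
    summary: str


def route(alert: Alert) -> str:
    """Return who gets paged: the owning team's on-call, else the fallback."""
    team = SERVICE_OWNERS.get(alert.service)
    return ON_CALL.get(team, FALLBACK)


def page_message(alert: Alert) -> str:
    """Build a specific page text that names the service and the problem."""
    return f"[{alert.service}] {alert.summary} -> paging {route(alert)}"
```

The point of the fallback is that unowned alerts still reach a human; in practice you would want that list to shrink to zero as ownership gets filled in.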
Step 1: You start out with the founders being on call 24x7x365, or people among the first 10 or 20 hires "carry the pager" on weekends and evenings, and your entire company is doing unpaid rostered on-call.
Step 2: You steal all the underwear.
Step 3: You have follow-the-sun office-hours support staff teams distributed around the globe with sufficient coverage for vacations and unexpected illness or resignations.
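For what Step 3 looks like mechanically, a rough sketch of a follow-the-sun lookup: pick the office whose working hours cover the current UTC hour. The three offices and their shift boundaries are purely illustrative assumptions, and this ignores vacations, holidays, and daylight saving shifts.

```python
from datetime import datetime, timezone

# Hypothetical shift table: (start_hour_utc, end_hour_utc, office).
# The three 8-hour windows together cover the full 24 hours.
SHIFTS = [
    (0, 8, "Sydney"),
    (8, 16, "Dublin"),
    (16, 24, "San Francisco"),
]


def office_on_duty(now: datetime) -> str:
    """Return the office covering `now`; expects a timezone-aware datetime."""
    hour = now.astimezone(timezone.utc).hour
    for start, end, office in SHIFTS:
        if start <= hour < end:
            return office
    # Only reachable if the shift table is edited and leaves a gap.
    raise ValueError(f"no office covers UTC hour {hour}")
```

Calling `office_on_duty(datetime.now(timezone.utc))` gives the office that should pick up a page right now, so nobody has to be woken up as long as each office is adequately staffed.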