But also, I interviewed for DevOps there and they only hire leet coders, so I suspect they're missing some of the fundamentals needed for a stable system.
There have been major problems with systems I work on where the overriding priority is "get it running now however you can, then find and fix the problem after a little uptime gives you breathing room." That cycle may repeat a few times before we can identify the root cause and find a solution.