> just curious why you blame that on containers and k8s?
Crawl, walk, run sorta stuff. We never got the application, monitoring, and everything else humming on plain Linux hosts first; we skipped that step entirely because "K8s and containers!" When you haven't properly QA'd and vetted your stack, throwing heavy abstraction at it (containers/K8s) is an anti-pattern.
Most companies don't have the resources to run a competent K8s distributed compute infrastructure, and as a hiring manager (as much as an IC) I know I'd have to hire very specific, very expensive people for that role. Good ops folks come with experience in their realm, and the newer the tech stack, the harder it is to find competent help given talent market conditions.
I don't blame containers and K8s - I blame the people, and I blame companies/teams for jumping at new tech that often doesn't have a justifiable use-case beyond "we're doing the popular thing!" rather than really considering what the solution actually needs.
I also have a very low tolerance for downtime, and with those huge abstractions I find things get missed more often, leaving my application down for my users. I am a KISS engineer.