Examples I can think of are RabbitMQ and Cassandra. But in general, we have some really battle-tested software these days that has become simpler to configure and run over time. People seem scared to run their own these days.
I happen to disagree strongly, though: lots of engineers in my experience undervalue the work of systems administrators and underestimate the effort needed to operationalize any technology.
Running your own is absolutely fine if you are willing to keep your stack small and invest time learning the tools you pick. But there are still horror stories of people thinking snapshots are backups, turning the wrong knobs and turning off fsync on their databases, ...
Most small startups are actually the ones who don’t really need SaaS services.
But developers are part of this problem too. There are plenty of times when I see devs immediately reach for new tools instead of learning just a little bit more about what they already have. My favorite example is when folks want to add a NoSQL db into the mix on top of a traditional db. Not because there's a real performance need, but because for their use case it is 'easier'. Never mind that their problem possibly could have been solved by just writing their own SQL instead of trusting a garbage ORM...
This comparison is the #1 flawed sales tactic the cloud companies use to convince you you're saving money.
> This comparison is the #1 flawed sales tactic the cloud companies use to convince you you're saving money.
Time is of a limited quantity and time spent managing postgres backups (for example) is time not spent doing other (possibly more meaningful/impactful _to the business_) work.
Alternatively that practiced engineer could have spun up a self-managed ES cluster in a couple of DCs in about the same time, but now has the obligation to maintain those servers (patching, etc.). Maybe that marginal cost is damn near zero - chef has been deployed to all instances and enforces patching and there's already good security monitoring in place, etc. The cost of that engineer managing that box, as with a managed ES in AWS, is practically nothing.
TL;DR: as in all cases, it depends.
We've seen our teams go both from managed to non-managed and from non-managed to managed with relative success. To give a sense of scale, across all of our accounts we spend way north of $3 million/month at AWS, so this has happened within our realm quite a few times. The short, unsatisfying answer is that _it depends_. We have an internal policy from the suits that "if there's a managed version, use it", but most of our teams are thankfully smart enough not to take that at face value and to do their own analysis.
As far as I'm concerned MSK is cheap: one broker is priced roughly the same as 2 equivalent EC2 instances. And you don't have to worry about ZooKeeper at all!
For us, our pipeline was actually easier to build with Flink than with Glue because of the restrictions that Amazon places on Glue, and that factored into our decision.
The advantage of Glue or the corresponding serverless GCP ETL option (Dataflow) is that it's serverless and elastic, but it sounds like their workload wasn't a good fit for that.
Admittedly there’s a difference between optimizing fully-controlled resources and cloud provider managed services. For one, low visibility into cloud service internals makes such optimization harder.
Cloud services gave new options for variable use and reallocating management costs but they also did something which most places were not used to: expose every detail as an itemized bill. That makes costs more visible than they’d been for most organizations which is good in the sense that people can make architectural decisions with pretty detailed numbers but bad in that many CIOs get sticker shock unless they’d done a well above average job calculating on-premise TCO.
But the problem with AWS, with a lot of the "cloud", is the pitch that remote centralization of a service scales ad infinitum. It's still subject to the same constraints as self-managed, even if those constraints appear at a higher limit.
The greatest constraint is the per-unit pricing. You buy self-managed, you have huge upfront and period costs, but with remote, you see the $.03/MB price and assume that variable cost is more manageable over the long run. And it is... until price changes, overhead changes, bandwidth changes, or worse, accessibility changes. And suddenly, what you had cost-effective scaling on 18 months ago now has a massive deficit affixed to it. Because that's how most people used the platform... or because removing A or B features reduced maintenance costs or freed up bandwidth.
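To put rough numbers on the per-unit point, here's a minimal sketch of the break-even between an upfront-heavy self-managed setup and a metered service. Every figure below (the $90k upfront, $2k/month upkeep, 200,000 MB/month of usage, and the jump to $.05/MB) is a made-up assumption for illustration; only the $.03/MB rate comes from the comment above.

```python
def self_managed_cost(months, upfront, monthly_fixed):
    """Cumulative cost of self-managing: one upfront outlay plus fixed upkeep."""
    return upfront + monthly_fixed * months


def metered_cost(months, units_per_month, price_per_unit):
    """Cumulative cost of a metered service billed per unit of usage."""
    return units_per_month * price_per_unit * months


def break_even_month(upfront, monthly_fixed, units_per_month, price_per_unit,
                     horizon=120):
    """First month in which metered spend catches up with self-managed spend.

    Returns None if that never happens within `horizon` months.
    """
    for month in range(1, horizon + 1):
        if (metered_cost(month, units_per_month, price_per_unit)
                >= self_managed_cost(month, upfront, monthly_fixed)):
            return month
    return None


# Hypothetical workload: 200,000 MB/month, $90k upfront, $2k/month upkeep.
at_old_price = break_even_month(90_000, 2_000, 200_000, 0.03)
at_new_price = break_even_month(90_000, 2_000, 200_000, 0.05)
print(at_old_price, at_new_price)  # 23 12
```

With these invented numbers, a price bump from $.03 to $.05 per MB pulls the point where metered stops being cheaper from month 23 to month 12, which is the kind of shift that turns "cost-effective scaling" into a deficit.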
AWS is an experiment. Does it work in many or even most use cases? Yes. For now.
I love engineers. A lot. In fact, being in sales, I would give up a deal with an engineering team unless I knew for sure my ROI basis was solid. That said, I do know sales and marketing rhetoric. And having spent hundreds of hours in meetings with product, marketing and dev professionals, I wish I could record the stress-induced breakdowns I've seen in engineers and executives who had everything running buttery, "and then [provider] pushed [update]..." and they then have executives breathing on the back of their neck 16 hours a day, entire teams offline or unable to do basic tasks, etc. I just want to play that shit to people and say, "This is why you don't overpromise."
A 67% reduction doesn't tell the whole truth. They have more services to manage now, which means they need more people and more time to do this.
Saving 10k from your AWS bill by hiring 2 more engineers is not cost effective.
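Back-of-the-envelope version of that claim, as a rough sketch. I'm assuming the 10k is per month and picking a fully loaded cost of $12k/month per engineer; both are my assumptions, not numbers from the article.

```python
def net_monthly_savings(bill_reduction, extra_engineers, monthly_cost_per_engineer):
    """Monthly cloud bill reduction minus the monthly cost of the added headcount."""
    return bill_reduction - extra_engineers * monthly_cost_per_engineer


# Assumed: $10k/month saved on the bill, 2 hires at $12k/month each.
net = net_monthly_savings(10_000, 2, 12_000)
print(net)  # -14000
```

A negative result means the hires cost more than the bill reduction brings in. Under these assumptions the bill would have to drop by at least $24k/month just to break even.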
Where did we get the idea that engineers are hired to do only one thing?
This has never ever been the case in my experience.
Also, Kafka being hard to manage is not the case. A quick look at the many small companies and startups running their own clusters shows otherwise.
I also know many startups and small companies investing 5 people and 6 months to get an observability platform up and running when they could just get Datadog or New Relic for half the price... and I'm not even taking into account outages and updates to the platform.
I remember a recent Uber blog post on how they moved from build tool A to build tool B, and a couple of weeks later, 3000 people were laid off. It's important to spend development time on revenue streams.
This is a nice piece of advice: https://nav.al/build-a-team-that-ships
"Outsource everything that isn’t core. Resist the urge to pick up that last dollar. Founders do Customer Service."
At a certain size or number of self run services, they very well might be. I used to be the guy that did the set up for these sort of self managed solutions, and ran them day to day. In some shops the workload was high enough we needed multiple people like me doing it. Or a whole team. Doing DevOps style management of them just let us do it with fewer people - it certainly didn't make it feasible for developers to do the day to day management of these services and still write code.
Haha, so they cleaned up the internal IT / DevOps mess, called it a day, and then wrote a blog post about it.
The whole point of AWS is to use services on demand; paying for capacity you never use is like buying 133 conference tickets for your 100-person company.
If I make a .NET service or site, I know (with the tools I use) I can deploy it on any Linux or Windows machine without issue. I can take it anywhere that I can run any software.
Sure, you may need more glue for certain scenarios, but you know that you can move as soon as a provider shows its fangs.
I'd really like to start seeing a series of blog posts from companies who are running extremely lean and efficient tech environments by utilizing cloud in an intelligent manner and avoiding the expensive and unnecessary bullshit that's so prevalent today. The ones that can brag "How we run a $4M/yr SaaS on $40k/yr of AWS spend!" are far more interesting than "How we stopped incinerating millions of VC money by simply turning off shit we didn't need"
Maybe the blog post would have been "How we run a $1M/yr SaaS on $40k/yr of AWS spend!" instead of $4M?
i.e., when working on a new product or feature, being able to understand up front that "this managed service costs x% more than the more bare-bones option", etc.
Essentially turning alchemy into a science.
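The "x% more" number is easy enough to compute once you have both prices. Here's a small sketch; the $150/month instance price is a placeholder I made up, and the 2x ratio is just the "one MSK broker costs about 2 equivalent EC2 instances" figure quoted elsewhere in this thread.

```python
def managed_premium_pct(managed_cost, bare_bones_cost):
    """How much more the managed option costs, as a percentage of the bare-bones cost."""
    return 100 * (managed_cost - bare_bones_cost) / bare_bones_cost


# Placeholder price for one self-run instance; the managed broker is taken as 2x that.
instance_monthly = 150
broker_monthly = 2 * instance_monthly
print(managed_premium_pct(broker_monthly, instance_monthly))  # 100.0
```

That only covers the sticker price. The time spent patching and babysitting the bare-bones version would have to be added to its side of the comparison before the percentage means much.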
I think a lot of people make the mistake of assuming AWS is just an easy off-the-shelf thing you can just grab, but if you use it seriously it's a full-time job and its own expertise.
Source: I've done some AWS certifications, never was able to put them into practice though. I've also worked in multiple organizations that migrated to AWS, they all had a full-time team of people managing it.
It's a full-time, specialist job and you can't just palm it off to your engineers as a background thing.