First of all, how does it persuade you of that? The article touches a really small (though incredibly important for up-time) subject.
Secondly, in any large company, the majority is 'bloat'. It's security engineers, code reviews, data architecture, HR, internal audit teams, content moderators, scrum masters and I can keep going. In a start-up many of these roles can be ignored, because growth > stability. In a large organization, part of the bloat helps ensure a certain amount of stability that's necessary to keep an organization alive.
If a product is mature enough, like Twitter seems to be, removing engineers won't instantly crash the product. It'll happen slowly. Bugs will creep in, because less time is spent on review and overall architecture. Security issues will creep in because of about the same issues and less oversight. Then, once this causes enough issues for the product to actually crash, the right people to fix it quickly might not be there anymore. That's when fixing the issues suddenly takes a lot more time.
If the current state of affairs at Twitter keeps up, it'll probably be a slow descent into chaos. Especially with Elon pushing for new features to be implemented quickly, inevitably by people who cannot fully understand the implications of said features, because 80% of knowledge is missing.
By flowing from "many people think it's bloat - I'll tell you what's really going on" to "a tiny team of 1~3 built the whole infra for a critical component".
I'm not really trying to make commentary on whether or not Twitter engineering was bloat, or whether or not I think it'll hit problems in the future. Just commenting on the fact that the article broke my expectations a little bit as a reader.
There's no doubt that OP built a great and stable automation layer on top of Mesos for caching workloads. But there are numerous other types of workloads on top of Mesos (including, I presume, mission-critical database deployments that need well-disciplined draining protocols to shift between nodes), as well as administrative needs at the Mesos-to-infrastructure level, and things running on bare metal below the Mesos level. These things all needed dedicated SREs, and the absence of these SREs could result in a scenario like the one mentioned in the Twitter thread I linked - two obscure mutually-dependent components expire and cannot be re-provisioned using documented tools.
I also think an important meta-point is that when Twitter was bringing in substantial revenue from advertising, every minute of downtime would have significant costs - costs that could make it easily worthwhile to "over-provision" SRE talent. With advertisers pausing engagement, perhaps Twitter loses less money from a day-long outage than it would save having the right talent to turn a day-long outage into a minutes-long outage.
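To put rough shape on that trade-off, here is a back-of-envelope sketch. Every input is a made-up placeholder rather than anything known about Twitter's finances; `revenue_per_minute`, `outages_per_year` and `team_cost_per_year` are hypothetical parameters you would have to estimate yourself.

```python
def annual_outage_cost(revenue_per_minute, outage_minutes, outages_per_year):
    """Revenue lost per year to outages of the given length."""
    return revenue_per_minute * outage_minutes * outages_per_year


def extra_sre_worth_it(revenue_per_minute, outages_per_year,
                       slow_minutes, fast_minutes, team_cost_per_year):
    """True if the revenue saved by faster recovery exceeds the team's cost."""
    slow = annual_outage_cost(revenue_per_minute, slow_minutes, outages_per_year)
    fast = annual_outage_cost(revenue_per_minute, fast_minutes, outages_per_year)
    return (slow - fast) > team_cost_per_year


# Hypothetical numbers only: a day-long outage (1440 min) vs a 10-minute one,
# twice a year, against a team costing 5,000,000 per year.
healthy_ads = extra_sre_worth_it(8_000, 2, 1440, 10, 5_000_000)
paused_ads = extra_sre_worth_it(1_000, 2, 1440, 10, 5_000_000)
```

With these placeholder inputs the team pays for itself while ad revenue is high and stops doing so once revenue per minute drops far enough, which is the point the comment is making. The real answer depends entirely on numbers none of us have.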
Twitter is only judged by its profitability (namely, Musk's ability to service debt without selling more Tesla stock than he already has), while most other tech companies (both public and private) are judged by both profitability and revenue growth. If you want both, larger SRE teams, to say nothing of feature development and regulatory compliance teams, start to make a lot more sense.
It also (a) increases the bus factor, [1] and (b) allows people to take vacations and time off without having to watch their phones like a hawk.
It's not about having enough people to do the work even if someone quits, it's about having enough people that know how to do something that we aren't losing chunks of knowledge if someone quits (or dies, or gets fired, or gets sick, or etc).
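A minimal sketch of that idea, assuming you can write down who actually knows each component. The component and people names are invented for illustration, and "knows" is a judgment call that no script can make for you.

```python
def bus_factor(knowledge):
    """Smallest number of departures that leaves some component with nobody who knows it."""
    return min(len(people) for people in knowledge.values())


def orphaned_components(knowledge, departed):
    """Components whose every knowledgeable person is in `departed`."""
    departed = set(departed)
    return sorted(name for name, people in knowledge.items()
                  if set(people) <= departed)


# Hypothetical knowledge map: component -> people who can operate and fix it.
knowledge = {
    "cache": {"ana", "bo"},
    "certs": {"bo"},
    "deploy": {"ana", "bo", "cy"},
}
```

Here `bus_factor(knowledge)` is 1 because only one person knows `certs`, and `orphaned_components(knowledge, {"bo"})` reports exactly that component. Headcount alone says nothing about this: the team has three people, yet losing one specific person loses a chunk of knowledge.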
It doesn't make sense to me to treat people as part of a conduit bus that are interchangeable as long as there are enough people.
It's amazing to me how many people following the Twitter saga, some familiar with or actually working in technology, thought that Twitter would crash within days of the engineers being fired. And because it didn't, the job cuts are justified.
With that said, there are differences between internal systems and something like Twitter on the public internet. I assume that Twitter is a system under constant attack. What happens when the next log4shell level vulnerability comes out?
The car analogy is amusing, but how much does it really hold up? Have we ever seen another major social media company drop this much of its staff in one go? I certainly can’t think of an example. I think we’re in somewhat uncharted waters here.
A driverless car won’t last long, we know that for a fact. I think it remains to be seen how long a bloatless twitter can last. I’m personally optimistic.
This is an excellently apt analogy, in light of Twitter's new owner.
Unless the company that creates it is owned by Elon ;)
Because they work for companies where the product would fail within days of them being fired themselves.
For the type of jobs at hand here, one of the things I learned is that nobody is essential. Even that person you think is essential.
Of course it could go either way but the jury is currently out. It’s entirely possible that severe company-impairing technical breakdowns are already in progress and unrecoverable.
Or maybe not.
On the other hand, as engineers, we tend to attach way too much self-importance to our roles. Like if we're not there entering the "numbers" 4 6 15 16 24 32 every 108 minutes, the entire business is going to crumble. So... this is one I'm going to watch with a keen eye.
I have never encountered an engineer who thought that.
It's not like Twitter was bug-free before. Many times it annoyingly refreshed the timeline while I was reading something, or showed a notification that it failed to send a DM and then, when I retried, said "you've already wrote this", or the reply dialog opened but froze with no send button at all, so I had to re-open it. All of this was happening to me pretty regularly long before Elon came along.
As we all know, just hiring more people is not necessarily the solution to every problem, and to me it seems that was exactly what Twitter tried to do in the past. Now they have stripped it down to the bare bones, which will clearly show what the core problems and requirements are. They basically turned Twitter back into a startup. And from that new starting point they can hire again to cover the needs as they arise. If they succeed it will be a huge success, as they'll end up with a far more optimal team (and huge savings), and of course, if they fail to keep up with problems it will be a huge failure. We'll see how well Musk can manage it...
Anyone thinking "huge success" is unrealistically optimistic, IMO.
It's going to be another MySpace/AOL/Bebo, added to the list of dumbest purchases ever.
And that's still going to be true if the point was to destroy the original community and replace it with a different political orientation.
How many millions are in a billion?
the power law applies to any big organization. 20% of the people do 80% of the work, whilst 80% of the people are just there for "support".
whatsapp was run by a team of like 20 people or something when they got acquired for $20 billion. for a simple software product, you don't really need that many people. in fact, more people often means bad software. you just need a small group of very talented engineers to run the product and add new features when necessary.
big (and especially public) companies often times need to hire a lot, just to look like a real company.
now that twitter is private, elon has no responsibility to public investors and can focus less on looking like a real company and more on doing what needs to be done to cut bloat/costs and improve product
https://twitter.com/IlluminatiGanga/status/15946097904324444...
new members joining in 1970. hmmm.
That said, there are lots of bugs in Twitter now, today, when they presumably had the benefit of being in stable mode for a long time. For example, Twitter regularly refreshes and loads new tweets while I'm reading them, pushing the tweet I was in the middle of reading out of view. That seems like a pretty silly bug to exist in a mature product. I regularly reach a state where I have to kill the app and relaunch it because all of the "back" commands just minimize the app instead of taking me back to the timeline. I could go on.
But regarding the bugs, I’m totally with you. Same here. I use Twitter only in the browser. Browse long enough and the page reloads as if it ran out of memory.
Have you implemented a system which stores hundreds of billions of pieces of media content and makes different slices of them immediately available to hundreds of millions of users?
I'm currently the first and sole architect on a product that was built by only devs. I really know why I exist.
Where are the 1000s of engineers for Postgres? Most stuff that works is made by a handful of people. Look at io_uring: it’s basically one guy at Facebook…
You're comparing Postgres to Twitter?
If some of the people making comments like this actually work in tech, then yeah, maybe there is a lot of bloat to be cut.
You focused mostly on additive bloat. There's also multiplicative bloat: multiple teams building separate versions of the same product to increase the likelihood of success, and empire building, where leaders don't actually have a remit large enough to support the team size they have but have woven a narrative that defends the necessity nonetheless. Put everything together and teams are very easily 6x+ larger than they absolutely need to be to get a product to market.
Please tell us in detail about the Twitter stack.
Because I always find it fascinating how people think they can estimate the effort to maintain it whilst having next to no understanding whatsoever of the tech stack.
1. A single person can run a mastodon instance in their spare time. Spinning up some containers for the app, a background worker and a database is quite simple.
2. Modern devops tooling makes it fairly trivial to spin up 10k instances of a container instead of 1, by just altering a number in a k8s manifest somewhere.
3. Ergo, a single person equipped with modern tooling (and sufficient funding) could spin up any number of mastodon instances.
4. Twitter is just a big mastodon instance.
5. Now that keeping everything up is sorted, add another 99 devs for feature development and you are done.
Now this is obviously faulty logic because points 3 and 4 are very false, but they look reasonable enough at first glance.
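For what it's worth, step 2 on its own really is about that small. A minimal sketch, using a plain Python dict to stand in for a Kubernetes Deployment manifest; the name `mastodon-web` and the image are made up, and `with_replicas` is a hypothetical helper, not part of any k8s tooling.

```python
import copy

# Hypothetical Deployment manifest for a stateless web container.
manifest = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "mastodon-web"},
    "spec": {
        "replicas": 1,
        "selector": {"matchLabels": {"app": "mastodon-web"}},
        "template": {
            "metadata": {"labels": {"app": "mastodon-web"}},
            "spec": {
                "containers": [
                    {"name": "web", "image": "example/mastodon:latest"}
                ]
            },
        },
    },
}


def with_replicas(manifest, count):
    """Return a copy of the manifest with spec.replicas set to `count`."""
    scaled = copy.deepcopy(manifest)
    scaled["spec"]["replicas"] = count
    return scaled


big = with_replicas(manifest, 10_000)
```

The only difference between `manifest` and `big` is one integer. That number asks for more copies of a stateless container and nothing else; it says nothing about the database, the background workers, or anything else those copies depend on, which is where the chain of reasoning falls apart.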
THEY could probably do it with 100 people, YOU cannot.
100 people is most likely within the ballpark for a group of people whose sole purpose is to write and maintain twitter's tech stack. Unfortunately, that is not NEARLY the sole purpose of most people in businesses and that adds all kinds of productivity hits.
What happens is that people like yourself become convinced that's the only way to operate.
Likewise, bringing in Ad money would be a few more hundreds, because you need to chase leads in all countries.
Getting the Ads to work? That's tech and I'd be surprised if it was less than 100 people, too.
I subpoena telcos all the time. My sense is that the number is closer to 2 to 3 dozen.
Maybe the user facing site, but that's just the tip of the iceberg.
There are plenty of internal/backend/restricted systems to support and/or monetize this part.
And that's not counting the huge number of support people & moderators needed.
Making it globally available and legally compliant, that's where the next few thousand folks come in.
The people shouting loudly about how Twitter must have been so bloated are really just shouting their obvious inexperience working at global scales or their localized ambitions.
Could there be too many employees at Twitter? Sure. Most companies have dead weight. The number who were "extra" is probably not 9/10ths of the employees though.
This is because you don't see the complexity. What you see as a Twitter user is a fraction of what's actually there.
You have to build a platform for ads. Not just serving ads, but allowing advertisers to prepare their collateral, preview them, get their results, and be billed. So that's an entire content and invoicing platform separate from your main feed.
And since your platform is all user generated content, you've got to build a moderation pipeline. A place for users to make reports, but also an interface for your content moderators to view content and make decisions. Oh, and while you're there you'd better build a portal for law enforcement to make data requests, along with your DMCA takedowns. Oh yeah, DMCA - that's another whole thing you've got to worry about.
Then the EU comes along and needs you to build something to support your GDPR obligations. Then India wants something similar, but only for its citizens. Your users also want verification, so better build that platform for securely verifying accounts and awarding checkmarks.
It snowballs. Was Twitter's engineering group bloated? Probably. Most large companies are. Could you run the whole Twitter tech stack as it exists today with a hundred people? Absolutely not.
Separately, some commenters here are flatly delusional about the effort to ship a site, Android and iOS apps, internal mod tools, help docs, support, and legal docs in 34 supported languages. Not to mention obeying laws in all the countries that implies.
Or image and video hosting! With recoding of videos, resizing of images, and the management of what is surely petabytes of images and videos with very high reliability! That is not a 1, 2, or 3 person job to do well.
"20 with cloud, 40 without. So much overlap between iOS, Android, and the web, three people can do all three. More for the backend." https://twitter.com/realGeorgeHotz/status/159371372367535718...
What we are watching is a massive failure event right now and the question really is if there's enough time for twitter management to fill in the gaps before there's an outage.
That's how it couldn't prove the claim.