But what's wrong with Twitter?
I'M GLAD YOU ASKED. There are two aspects of Twitter that just bug me as an engineer:
Ruby on Rails - Using rails to prototype a system is fine — scaling up to a million hits a day with it is just a bad idea. As the service grew, I'm sure it cost them a lot more time than it saved.
140 characters is not enough - I routinely write sentences longer than 140 characters, so I can't even begin to imagine making a point in such a small space. This textual confinement has led to the rise of URL shorteners, which are breaking the internet.
Blërg solves these problems by applying absurd reactionary engineering. Blërg's database backend is a custom C program that handles requests over HTTP and stores data in a very small and efficient indexed log-structured database. The frontend is done entirely in client-side Javascript. A single post can be up to 65535 bytes in length.
Which is not to say that I believe writing your service in C is the solution to all your problems. Clearly, this approach has just as many hairy problems that will bite you in the ass sooner or later. The best way, as with most things, lies somewhere in the middle of high-level abstraction and ZOMGHARDCORE OPTIMIZATION.
Or more politely described in the documentation page: Blërg is a minimalistic tagged text document database engine that also pretends to be a microblogging system. It is designed to efficiently store small (< 64K) pieces of text in a way that they can be quickly retrieved by record number or by querying for tags embedded in the text. Its native interface is HTTP — Blërg comes as either a standalone HTTP server, or a CGI. Blërg is written in pure C.
> Curious? Click this unbelievably obnoxious button!
> Blërg is a microblogging platform. Or maybe a miniblogging platform. Blërg is not sure.
> Blërg's author finds it entertaining to anthropomorphize Blërg in the third person.
This website, the documentation, the text from the comment you replied to, everything is just clearly humour.
The sad thing is that way too many developers never learn that the programming language and framework are not the biggest bottleneck.
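For anyone wondering what the "indexed log-structured database" quoted earlier could look like in miniature, here is a rough sketch. It is not Blërg's code (that is C, and nothing here is taken from it): the class name `TinyLogStore`, the 2-byte length prefix, and the `#tag` syntax are all assumptions made for illustration. Only the 65535-byte limit and the "fetch by record number or by tag" idea come from the quoted text.

```python
import io
import re
import struct

MAX_RECORD_BYTES = 65535  # post size limit quoted from the landing page

# Assumed tag syntax (#word); the quoted text does not specify one.
TAG_PATTERN = re.compile(r"#(\w+)")


class TinyLogStore:
    """Append-only log plus two in-memory indexes (illustrative only)."""

    def __init__(self, log=None):
        # Any seekable binary file object works; default to an in-memory buffer.
        self.log = log if log is not None else io.BytesIO()
        self.offsets = []  # record number -> byte offset in the log
        self.tags = {}     # tag -> list of record numbers, oldest first

    def append(self, text):
        data = text.encode("utf-8")
        if len(data) > MAX_RECORD_BYTES:
            raise ValueError("record exceeds 65535 bytes")
        self.log.seek(0, io.SEEK_END)
        offset = self.log.tell()
        # Hypothetical on-disk layout: 2-byte big-endian length, then payload.
        self.log.write(struct.pack(">H", len(data)) + data)
        record_no = len(self.offsets)
        self.offsets.append(offset)
        for tag in set(TAG_PATTERN.findall(text)):
            self.tags.setdefault(tag, []).append(record_no)
        return record_no

    def get(self, record_no):
        # One seek and two reads: the offset index avoids scanning the log.
        self.log.seek(self.offsets[record_no])
        (length,) = struct.unpack(">H", self.log.read(2))
        return self.log.read(length).decode("utf-8")

    def by_tag(self, tag):
        return [self.get(n) for n in self.tags.get(tag, [])]
```

Records are never rewritten in place, so a write is a single append and a lookup is a single seek. A real engine would also have to persist the indexes and cope with partial writes, which this sketch ignores.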
"""Is this a joke?
Yes. No. Maybe. Blërg is an exercise in constructive satire — a fully functional service created in a fit of hubris to poke fun at Twitter's engineering. It's just for fun, but no one is going to keep you from using it seriously. :]"""
The idea is to stop people posting the first thing that pops into their head: hopefully taking more time to formulate a response (or reconsidering if it was valuable to say anything at all).
So if we can detect you said something about climate change, you see a little popup asking you if water boils at 100 degrees Celsius or Fahrenheit? If you don't get it right, you are no longer allowed to post or reply to climate change topics.
Did you just say Trump was the BEST or WORST President in History? I'd like to see if you can name 5 Presidents before 1980? If not, I don't think we need your opinions on historical ranking.
The tests wouldn't be hard, and the answers would be easy to Google, but there would be a 10-second time limit to complete them.
It would probably be the least popular social media company ever created. This is also a terrible idea, because the users who earn the right to post probably would not be humble and might be jerks.
I'm sure other projects have started something like that, but I can't find them.
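A minimal sketch of how such a gate might work, purely as an illustration of the idea above. Everything here is hypothetical: the `QuizGate` class, the `QUESTIONS` table, and the substring-based topic detection are stand-ins, and only the 10-second limit and the sample question come from the comment.

```python
import time

TIME_LIMIT_SECONDS = 10  # limit proposed in the comment above

# Hypothetical question bank: topic keyword -> (question, accepted answer)
QUESTIONS = {
    "climate change": (
        "Does water boil at 100 degrees Celsius or Fahrenheit?",
        "celsius",
    ),
}


class QuizGate:
    def __init__(self, clock=time.monotonic):
        self.clock = clock   # injectable so the time limit can be tested
        self.issued = {}     # (user, topic) -> time the question was asked
        self.passed = set()  # (user, topic) pairs allowed to post
        self.blocked = set() # (user, topic) pairs barred from posting

    def topics_in(self, text):
        # Naive detection: a topic counts if its keyword appears in the text.
        lowered = text.lower()
        return [topic for topic in QUESTIONS if topic in lowered]

    def challenge(self, user, text):
        """Return (topic, question) if the user still owes an answer, else None."""
        for topic in self.topics_in(text):
            key = (user, topic)
            if key not in self.passed and key not in self.blocked:
                self.issued[key] = self.clock()
                return topic, QUESTIONS[topic][0]
        return None

    def answer(self, user, topic, reply):
        """Grade a reply; a wrong or late answer blocks the user on that topic."""
        key = (user, topic)
        issued_at = self.issued.pop(key)
        in_time = self.clock() - issued_at <= TIME_LIMIT_SECONDS
        if in_time and reply.strip().lower() == QUESTIONS[topic][1]:
            self.passed.add(key)
            return True
        self.blocked.add(key)
        return False

    def may_post(self, user, text):
        # Posts that touch no gated topic are always allowed.
        return all((user, t) in self.passed for t in self.topics_in(text))
```

The clock is passed in so the time limit can be exercised without actually waiting 10 seconds.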
1. You'll have to fuck around with router settings to forward the incoming traffic to that piece of hardware.
2. You'll have to have a domain, and fuck around with dynamically changing DNS settings because your IP is likely not static (this process depends on which DNS provider you choose to use).
To add to the complexity, you'll probably have to have some sort of a VPN because otherwise anyone would be able to see your public IP by just pinging your domain.
Point being: you can run Pleroma right now on a Raspberry Pi no problem, but doing so securely is another issue altogether.
I believe that Twitter has been running on the JVM for almost a decade.
Nothing in Rails prevented them from doing that properly, though Rails defaults may well have encouraged them to pick the easy way out initially (that in itself would not have been a problem either, if they'd started work on a proper backend in the background).
Sharded large-scale delivery of messages has been a "solved" problem for decades, irrespective of language. Twitter as-it-is now is big enough to have all kinds of fun operational issues, but when they were struggling their volume was nothing impressive.
So long and thanks for all the barfs. :)
Has it been running on a single node all this time? What are the hardware specs? How many requests per second did it handle? Has it gone down because of the load? What was the bottleneck (network, CPU, I/O)? How did it fail (crash, slowness, fire ;-)? Do you have monitoring data to show? How many Twitter-like active users would it be able to serve from that single node?
Thanks, and Blërg! :-)
$ git clone http://git.bytex64.net/blerg.git
Cloning blerg...
error: Empty reply from server (curl_result = 52, http_code = 0, sha1 = 95d56e6cc [...])
[...]
Sigh! How can we look at the C code? [EDIT: now looked at other comments about how this is a joke... is this error intentional?]
Here's something I never thought I'd say: Perl made breaking changes. My guess is additional strict things broke the gitweb CGI script (which was hopelessly out of date anyway).
I'm planning to build a "Microblogging for companies"/"Slack for microblogging" SaaS this year because I think async communication is the way for remote work.
Currently, I'm looking at existing microblogging platforms to get an idea of common features.
Next step will be interviewing potential customers.
The scaling problem is what AWS's James Hamilton used to call the "Hairball Problem", in which there is no single database partitioning scheme that performs well under all workloads. The Subscriptions [1] section of the documentation demonstrates this problem:
> I immediately came up with the naïve solution: keep a list of users to which users are subscribed, then when you want to get updates, iterate over the list and find the last entries for each user. And that would work, but it's kind of costly in terms of disk I/O. I have to visit each user in the list, retrieve their last few entries, and store them somewhere else to be sorted later. And worse, that computation has to be done every time a user checks their feed. As the number of users and subscriptions grows, that will become a problem.
> So instead, I thought about it the other way around. Instead of doing all the work when the request is received, Blërg tries to do as much as possible by "pushing" updates to subscribed users.
This approach works well except when some users have hundreds of thousands of followers, which results in an explosion of "pushing" updates. This doesn't even take into account geography-related latencies of a global user base. Furthermore, naive attempts at databases without fault tolerance assume that losing recent data is not a big deal. It isn't; the big problem is failures that corrupt data so that it can't be restored to a known good state.
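To make the trade-off concrete, here is a small sketch contrasting the two strategies from the quoted documentation: pull (build the feed when it is read) and push (write to each subscriber's feed when a post is made). It is only a toy model, not Blërg's implementation; the `Feeds` class, the `FEED_LIMIT` cap, and the `push_writes` counter are hypothetical names added to show where the cost lands.

```python
from collections import defaultdict, deque

FEED_LIMIT = 50  # hypothetical cap on stored feed entries per user


class Feeds:
    def __init__(self):
        self.posts = defaultdict(list)     # author -> [(seq, author, text)]
        self.followers = defaultdict(set)  # author -> users subscribed to them
        self.following = defaultdict(set)  # user -> authors they subscribe to
        self.inbox = defaultdict(lambda: deque(maxlen=FEED_LIMIT))
        self.seq = 0                       # global post counter for ordering
        self.push_writes = 0               # inbox writes done at post time

    def subscribe(self, follower, author):
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def post(self, author, text):
        self.seq += 1
        entry = (self.seq, author, text)
        self.posts[author].append(entry)
        # Push model: one write per follower, paid on every post.
        for follower in self.followers[author]:
            self.inbox[follower].append(entry)
            self.push_writes += 1

    def feed_pull(self, user, n=10):
        # Pull model: visit every followed author on every read, then sort.
        entries = []
        for author in self.following[user]:
            entries.extend(self.posts[author][-n:])
        return sorted(entries, reverse=True)[:n]

    def feed_push(self, user, n=10):
        # Push model: the feed is already assembled, newest entry last.
        return list(self.inbox[user])[-n:][::-1]
```

In this model a single post by an author with 100,000 followers costs 100,000 inbox writes, which is the explosion described above, while the pull version shifts that cost onto every feed read instead. The sketch ignores persistence, backfill for new subscriptions, and failure handling entirely.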
Your analysis of Blerg's analysis of Twitter's technical shortcomings is facile. It's a system that the author put together while intoxicated to poke fun at Twitter. Reading any more into it than that is ill-advised at best.
(Source: Used to live with Blerg's author)
The HN title is "a microblogging platform", the landing page has a section titled "But what's wrong with Twitter?", and the documentation has a "Design" [1] section. If the author intended to poke fun at Twitter as a stand-in for all microblogging platforms, then leave out the engineering aspects; if the author intended to poke fun at Twitter engineering, then land some technical blows. If the point of the exercise is to demonstrate how much code and documentation can be pumped out during a drinking binge, then the author should state that upfront; that context is useful.
Butt-ugly interface? Check!
Mentioning the technology used to build it as if it were a feature? Check!
So a typical tool created by a developer who thinks the average guy thinks and acts like them.
...what?