It's also interesting to see different companies approach this problem differently - Facebook famously built a way to run their PHP source code (by compiling it to C++ and then running it natively) instead of actually rewriting the source in a different language. I wonder if something similar would have been possible for Twitter, or if they weren't happy with how their existing code was structured in the first place, which may have made the rewrite more attractive.
Why do I say that? Because in Twitter's case, handling tweets is a "trivial" problem to parallelise, and the potential savings of switching language will be a rounding error in terms of scaling their system compared to getting the architecture right.
(To scale Twitter: Make trees. Split the following list into suitably wide trees, and your problem has now been reduced to an efficient hierarchical data store + efficiently routing messages. Both are well understood, easy to scale "solved" problems)
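A minimal sketch of the "make trees" idea, assuming it means splitting a follower list into a delivery tree where no node forwards to more than a fixed number of children. The names `Node`, `build_tree`, `deliver` and the `width` parameter are made up for illustration; this is not how Twitter actually does it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One relay in the delivery tree; only leaves hold follower ids."""
    followers: list = field(default_factory=list)
    children: list = field(default_factory=list)


def build_tree(followers, width):
    """Split a follower list into a tree whose nodes fan out to at most `width`."""
    if width < 2:
        raise ValueError("width must be at least 2")
    if len(followers) <= width:
        return Node(followers=list(followers))
    # Ceiling division, so this node ends up with at most `width` children.
    chunk = -(-len(followers) // width)
    children = [build_tree(followers[i:i + chunk], width)
                for i in range(0, len(followers), chunk)]
    return Node(children=children)


def deliver(node, tweet, inbox):
    """Route a tweet down the tree, appending it to each follower's inbox."""
    for follower in node.followers:
        inbox.setdefault(follower, []).append(tweet)
    for child in node.children:
        deliver(child, tweet, inbox)


def depth(node):
    """Number of levels in the tree (a single leaf has depth 1)."""
    return 1 + max((depth(child) for child in node.children), default=0)
```

The point of the sketch is that depth grows logarithmically with the follower count, and each subtree can be handed to a different machine, since the subtrees share no state.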
There are many dynamic user driven sites which have scaled well (far less downtime than Twitter) without switching to static compilation.
Also, Twitter doesn't have Google's infrastructure.
With regard to "static compilation", that's not the important bit, but rather the performance of the virtual machine. The JVM, at its core, is not working with static types. The bytecode itself is free of static types, except for when you want to invoke a method, in which case you need a concrete name for the type on which you invoke that method ... this is because the actual method that gets called is not known in advance, being dispatched based on "this", so you need some kind of lookup strategy, and therefore the conventions that Java uses for that lookup are hardcoded in the bytecode. However, invokedynamic, introduced in JDK 7, allows the developer to override that lookup strategy, allowing complete dynamic freedom at the bytecode level, with good performance characteristics.
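A rough analogy in Python, not JVM bytecode: the sketch below shows what "dispatched based on this" means by writing the lookup strategy out by hand. The classes and the `lookup`/`invoke` helpers are hypothetical; the JVM's real resolution rules are more involved.

```python
class Animal:
    def speak(self):
        return "..."


class Dog(Animal):
    def speak(self):
        return "woof"


class Puppy(Dog):
    pass  # inherits speak from Dog


def lookup(receiver, name):
    """Find `name` by walking the receiver's class hierarchy, most specific first."""
    for cls in type(receiver).__mro__:
        if name in vars(cls):
            return vars(cls)[name]
    raise AttributeError(name)


def invoke(receiver, name, *args):
    """Call the method chosen by the receiver's runtime type."""
    return lookup(receiver, name)(receiver, *args)
```

Swapping out `lookup` for a different function is the part that loosely corresponds to overriding the lookup strategy: the call site stays the same, only the rule for picking the target changes.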
The real issue is the JVM versus the reference implementations of Ruby/Python. The JVM is the most advanced mainstream VM (for server-side loads at least).
Unfortunately for Facebook, they didn't have a Charles Oliver Nutter to implement a kickass PHP implementation on top of the JVM - not that it's something feasible, because PHP as a language depends a lot on a multitude of C-based extensions. The more pure a language is (in common usage), the easier it is to port to other platforms. Alternative Python implementations (read Jython, IronPython) have failed because if you want to port Python, you also have to port popular libraries such as NumPy. Which is why the PyPy project is allocating good resources towards that, because otherwise nobody would use it.
what I find even funnier is that people just throw in play and grails without even a little experience in rails. The ecosystem is entirely different:
- you can use java and therefore java libraries (yes, you could do the same in jruby, but nvm)
- bundles/gems are an order of magnitude better than classical java dependency hell.
- need something in rails? add a gem. need something in grails? search through the outdated plugins, search for missing documentation (just take a look at stackoverflow). in general -> write it yourself or pay a consultant to do it.
- have a question? fight with incomplete documentation.
on top of that, grails is just a stack on top of Spring MVC.
what about play? i like play more than I like grails tbh, because it doesn't want to be the rails of java. yes people compare it to one another, but that's just the familiarity effect.
now, where's the computationally intensive stuff? nowhere to be found. it's a web api. where's the computationally intensive stuff in twitter? I don't know, but chances are there's a native extension for that nowadays.
There actually was a time when you simply could not build a scalable system in ruby without too many hoops, but that's no longer the case. Yes the GIL is bad, but keep in mind that things like ruby fibers didn't even exist at the time.
My memory might be playing tricks on me, but I remember play developers talking about rails being a huge influence.
If Play isn't Rails for Java, how else does one do a Rails for Java? The "familiarity effect" is there because Play is modeled after Rails.
> now, where's the computationally intensive stuff? nowhere to be found. it's a web api. where's the computationally intensive stuff in twitter? I don't know, but chances are there's a native extension for that nowadays.
Search, for one, is computationally intensive. I am pretty sure there are more that the outside world doesn't know about.
> There actually was a time when you simply could not build a scalable system in ruby without too many hoops,
What scale are we talking about? At Twitter scale, ruby or anything else has to jump through hoops. For example, network load from users coming online and offline on FB chat will break out-of-the-box solutions.
This is not unlike those stories where a developer writes a trivial program in a new language that is similar to a trivial program they wrote in another language 1+ years prior and compares the results. "I wrote program foo in 30 lines of code in language X, which is much better than the 120 lines of code it took me in language Y two years ago". It's natural for a developer to write a shorter program two years later since they have 2 more years under their belt. In fact I'd expect the same program to be much better two years later even if written in the same language.
Fine, but what I wish I understood was why RoR is so popular to begin with. The claim is always that Rails is so wonderful that it's worth learning Ruby just to be able to build web apps with Rails. Well, if so, then why isn't there a sequel to RoR: Python on Rails? "The benefit people care most about (Rails, not Ruby) using the language you already know (Python)." Since Python, Perl, and Ruby are so similar except in syntax, and people love the Rails part more than the Ruby part, and so many more people know and love Python, and new Python web app frameworks appear all the time...why isn't one of them Python on Rails after years of Pythonistas being forced to abandon Python and learn Ruby just to be able to use Rails?
Is there some significant difference between Python and Ruby that explains this? What makes Rails so attractive and why isn't the same thing done with Python?
If you get to the scale of twitter, and more importantly if you have written a real time messaging server with a web application framework, it doesn't matter which framework or language you started with as you're going to be completely rewriting your entire stack at some stage as different parts fail in order to deal with the load (unless you have an incredibly experienced team who has written a twitter equivalent before and scaled it to 15000 messages a second). There's no difference between ruby or python (or perl, or php) in that regard - they are all interpreted and relatively slow, which starts to matter at this scale. And also even if you had used java or c in the first place, if you had an architecture not written with massive scale in mind, you probably wouldn't survive that growth without radical changes on every level of your stack.
Re ruby versus python, the popularity of rails is partly historical accident, partly that ruby is a nice language which doesn't get in the way and is ideally suited to this domain, and partly that rails deals with lots of the basics of web development for you without getting in the way too much when you need to adapt it. None of that means ruby is better than python, but I'd disagree that people put up with ruby in order to use rails - it's a really nice language in its own right, but it is not highly performant (though it is getting better). For most websites of course, many of which can employ caching, that is a non-issue, even at large scale - witness the success of Wordpress in php which would not survive even modest loads without caching.
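To make the caching point concrete, here is a minimal sketch of a time-limited page cache sitting in front of a slow render function. `PageCache`, `render` and `ttl_seconds` are illustrative names, and real sites would more likely put something like a reverse proxy or memcached in that position.

```python
import time


class PageCache:
    """Serve a stored page until it is older than `ttl_seconds`, then re-render."""

    def __init__(self, render, ttl_seconds=60.0, clock=time.monotonic):
        self._render = render      # the slow function, e.g. a full page build
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the cache can be tested
        self._entries = {}         # path -> (time stored, rendered body)
        self.hits = 0
        self.misses = 0

    def get(self, path):
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]
        # Missing or stale: pay the rendering cost once and store the result.
        self.misses += 1
        body = self._render(path)
        self._entries[path] = (now, body)
        return body
```

With a cache like this, the slow language only runs once per page per TTL window, no matter how many requests arrive in between, which is why interpreter speed stops mattering for cacheable content.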
http://www.oracle.com/technetwork/java/javase/gc-tuning-6-14...
Ruby and Python as languages are not radically different in kind, but their respective developer communities have had different focuses, and as a consequence their libraries of tools are not identical.
Ruby on Rails became popular because people were dissatisfied with the way that web development was being done, and DHH is very good at marketing/propaganda. Ruby on Rails has had a substantive effect on the way that web development is done, and there have been numerous attempts in other languages (that didn't already have a framework like Django) to recreate the things that Ruby programmers enjoy with Rails (CakePHP and Grails come to mind immediately).
To elaborate a bit, Rails tended to have a lot more generated code (the fabled "magic") that really sped up development of your standard CRUD apps. As far as I can tell, this was fairly novel in web development. Django didn't really have a focus on that, you spent a little (lot?) more time configuring. It's better now though.
Of course, at the complete opposite end of the spectrum, you have very minimalist stuff like Flask (Python) or Sinatra (Ruby), which don't include a lot of bells and whistles. You'll have to import your own ORM, templating, etc...
When Rails was introduced, there were no other substantive Ruby web frameworks - Ruby itself was relatively obscure compared to every other web-capable tech. As such, it (Rails) had no competition in the framework space. PHP, Java, Python and other languages all had competing web frameworks to choose from.
I'm not sure that's the case. Perhaps had they started out by trying to scale to their current levels they would never have gotten off the ground. When you start developing a system for a new company your largest obstacle is almost always lack of product/market fit.
Twitter hasn't phased out Rails, it has replaced background services written in Ruby.
> Fine, but what I wish I understood was why RoR is so popular to begin with.
That is hard to explain. You will have to try it out yourself and compare with whatever you are currently using.
> why isn't one of them Python on Rails
As other responses pointed out, Python has Django. And no, Django isn't Rails-inspired. Django was developed independently. Also, the languages and philosophies differ too significantly to have a Python on Rails.
I'm a relative newbie to the Ruby world, but as best I can tell, the Ruby and Rails communities both accepted long ago that they weren't made for Twitter levels of traffic.
Fact is, almost no one has Twitter levels of traffic besides Twitter. That's why Ruby and Rails are still so popular, because for ~99% of performance needs, they're more than capable and also extremely pleasant to work with.
Seems like a no brainer that something that is a lot more strict and static will outperform something that has to deal with many more possibilities at runtime.
Now, now, it's The Register. The tabloid style is part of its identity.
The wise programmer says: "Use the right tool for the job."
Please HN, we're wiser than trending stories would suggest.
If so, it's impossible for a language designer to do a good job, because it's impossible to improve a programming language.
The "no free lunch" theorem would indicate to me that there is no ultimate language, merely languages that are better for common (to you) use cases.
I think all languages have strengths and weaknesses that matter based on the context.
For example, I hate manually managing memory in C/C++. Java, Ruby, etc automate all of that for you, but it comes at a cost of using a lot more memory. That's not a big concern for many applications, but if you're doing embedded software or real-time systems it can be a deal breaker.
So in this example, is C++ better or worse than Java? The answer is that it depends what problem you're trying to solve, what environment you're running in, and what resources you have available on your development team. Personal preference matters too in terms of programmer satisfaction, but just because you like one language's constructs more than another doesn't mean it's well suited to every problem space.
As to the original article, I think it's great to learn about how different companies change their software stack, but not because "w00t Twitter hates Ruby - Java rocks"; rather it's interesting to see how business context changes over time and the implications that has on the technology. Seeing how other businesses have dealt with these hurdles can help you keep an eye out for them in your own business. I just wish that this was the lens that it was written in rather than "OMG - Ruby = Fail Whale."
If you're a language designer then it's good to hear about the complaints and preferences of programmers so that you can design a better language (and thus a better tool). And like any tool, there will be times when it's better to use one language rather than another.
So I think the (even just sometimes) discussion of "good language" should be limited to the context of purpose and goal of the said language, which is similar to "choosing the right language for the task".
Ideally, every person would have their own programming language (or culture), grown from their own personal experiences and desires. In that case, a qualifier for a good language is one that most easily allows expression of new languages or cultures at the individual level.
For a language designer, this could be an impossible problem due to the unique learning models of each individual. Their best hope may be in attracting those whose mental models have some overlap with their own.
Well, since "improve" is relative and a matter of taste, it is impossible to improve a language, isn't it? Make it run faster, and it will either take up more memory or be slower to develop with. Make it faster to develop with, and it will run slower or have some other tradeoff.
Hence, "use the right tool for the job". And when your focus shifts from "quickly building a product" or "quickly adding new features" to "rock-solid stable and fast", then you need to switch tools. Simple as that.
(They do use a bit of regular Java, but the majority of the core code is Scala, and it's Scala that should be getting the top line credit here, not Java.)
The truth is, Scala is actually helpful with the architecture they came up with AND it's great that it runs on the JVM.
I've always admired rails for its flexibility and its enormous productivity boost, but all my serious applications are coded in Lift. I for one believe in "develop and forget", because I'd rather call myself a business guy than a programmer, though I'm deep into both. I like to spend more time expanding/marketing my business than worrying about scaling it. But that's just my perspective.
The JVM is terribly underestimated, and I realized this when I got started with Lift+Scala. Scala is a very powerful language and requires a totally different mindset (=functional). And Lift is fairly complex for those wishing to get started with it and has poor documentation, despite being a 5-year-old framework. But once you understand it fully (somehow), there's no looking back. Lift provides so many things out of the box, especially related to security (unlike Play!), so it's a trade-off you have to make. Even if you compare all the benchmarks, most of the JVM-based languages like Scala outperform even something like Go! (OK, that's not fair, since Go is fairly new)
If you're interested in Scala, Coursera has a course on it by the creator of Scala himself (Martin Odersky).
If you're truly more interested in the business side, why don't you build it as quickly as possible in rails/django and then later if it warrants it you can hire some people to build it in lift/something else?
"Last week, we launched a replacement for our Ruby-on-Rails front-end: a Java server we call Blender. We are pleased to announce that this change has produced a 3x drop in search latencies and will enable us to rapidly iterate on search features in the coming months."
http://engineering.twitter.com/2011/04/twitter-search-is-now...
This article seems more like link-bait to me.
But in any case, if Twitter's architecture is truly scalable then any intrinsic slowness of the language shouldn't be a big problem, because they can just toss more hardware in to compensate. What is a problem is a buggy VM that leaks memory. To run thousands of instances in a heavily instrumented way, the VM must be stable and predictable.
I think their decision was more influenced by the experience of their team of programmers or by employable talent pool - they went the JVM|Java|Scala route because they had people with experience in high level languages and the JVM. If they happened to start with a team with "C hackers" background they would've gone the Ruby|C way and it would've worked as well.
...I think almost any language and technology can (be made to) scale, even to Twitter scale, at least if it's open-source and you have people with the required skills to hack around the internals and recode performance-critical parts in lower-level languages (basically C, C++ or Go nowadays).
http://engineering.twitter.com/2011/03/building-faster-ruby-...
While most won't ever encounter the issues Twitter encountered using Ruby and Ruby on Rails, these kinds of articles are very damaging for the Ruby language and Rails framework. Even though I still primarily use PHP, Ruby & Rails are something I have a vested interest in as well, and this will no doubt push potential newcomers away from the language.
Twitter has to deal with very high fanout and low cacheability.
I would have thought that you could just throw cheap hardware at that problem, whereas the database would be a considerably complex scaling issue.
But it is commercially solved. If you turn up with a fistful of money, Oracle, IBM, Sybase and a bunch of other companies would love to handle that for you.
History will remember the entire Ruby industry as a series of compounding failures.
The de facto formalisation and specifications.
The black-box behaviour of core development.
The broken-linked, un-versioned docs.
The rampant cargo-cult mentality.
The arcane exceptions.
The meta-frameworks.
Gem hell.
1.9/2.0
Rails.
No technology stack is perfect, but I've yet to meet a stack that was pure evil. There may be some cargo-cult personalities in the Ruby community, but, if there are, it's only because there is value in it.
An infinitesimal number of sites have to deal with Twitter's scale problems. The rest can work on getting crap done instead of worrying about Maserati problems.
Cargo-culting is what you accuse others of when they are learning and you do not like them.
The appropriate way to deal with newbies who do not fully understand the consequences of the decisions they have made (perhaps even while they are advocating others join them), is to explain their rhetorical and technological shortcomings in a way that others can learn from. Accusing someone of "cargo-culting" is just unhelpful character assassination.
Also, weak-sauce.
As someone who has seen Ruby cause major problems I'd have to say it's not too far off "pure evil"
This is a bit dramatic. Despite its faults, the language and community around it have been quite a success story. It's had a huge and positive influence on the web development world.
The Ruby community really, really cares about developer tooling and teaching others. Ruby is also one of the nice places where object oriented programming touches metaprogramming.
Don't forget that a lot of sysadmin work is done in Ruby now. Both Puppet and Chef are written in Ruby, and GitHub and Heroku both came up out of the Ruby community.
Is this really a problem? Every package/dependency manager seems to blow up on occasion. Gems haven't given me the problems that I've had with Pip/CPAN/Autoconf.
There are more fun things to do than fix bugs introduced by a backported patch.