Update:
Looks like Passenger is in development mode. Good job, you benchmarked a web server that no one uses while reloading all code between requests.
Update2:
Ok it seems to run in production mode but still, passenger is not an idiomatic choice.
They were all run in production mode with logging disabled, etc.
http://polycrystal.org/~pat/scratch/microbenchmark.png
Note that 10k requests per second seems huge compared to 3k, but if you invert the numbers, you get 100 and 333 microseconds per request, respectively. In a real, non-"hello world" app, these differences are going to be negligible.
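To make the inversion concrete, here is a minimal Python sketch. It assumes requests are handled strictly one at a time, so time per request is just the reciprocal of throughput; `micros_per_request` is a name made up for illustration, not something from the benchmark.

```python
def micros_per_request(requests_per_second: float) -> float:
    """Convert throughput (req/s) into average microseconds per request.

    Assumes serial handling; with concurrent workers the real per-request
    latency would be higher than this figure.
    """
    return 1_000_000 / requests_per_second

fast = micros_per_request(10_000)  # 100.0
slow = micros_per_request(3_000)   # about 333.3
print(f"{fast:.0f} us vs {slow:.0f} us, gap {slow - fast:.0f} us per request")
```

A gap of a couple hundred microseconds disappears quickly once a request does anything that takes milliseconds.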
Though perhaps it would be more interesting if instead of just responding "hello, world", the app parsed some query parameters or something. But I was mostly interested in the overhead of different JRuby servers, not comparing different app servers (i.e. overhead from Sinatra should be more or less identical whether you're on Puma, Trinidad, or whatever).
Our understanding is that when running Passenger, simply passing '-e production' to the command line is sufficient to run in production, but if that's incorrect, we'll gladly update the test.
"You configured framework x incorrectly, and that explains the numbers you're seeing." Whoops! Please let us know how we can fix it, or submit a Github pull request, so we can get it right."
Perhaps you/we should submit a pull request?
> rvm ruby-2.0.0-p0 do bundle exec passenger start -p 8080 -d -e production --pid-file=$HOME/FrameworkBenchmarks/rails/rails.pid --nginx-version=1.2.7 --max-pool-size=24
https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...
However, if you are interested, you can always check out other PHP frameworks (Yii, slim), or if you like all the bells and whistles from Cake and are willing to dive into another language, you can always experiment with ruby (ror, sinatra), node (express) and python (flask).
Similarly, relatively modern deployment standards (not really cutting-edge) like nginx and php-fpm OR apache 2.4 with worker mpm + php-fpm should be added to the mix.
If you have existing sites built with it which work fine, then this benchmark doesn't tell you a great deal other than that php is relatively slow, and frameworks built upon it slower. For many websites which have a caching strategy sitting behind a server like apache or nginx and fewer than a few thousand users a day, this really doesn't matter, and other things like features are more important. This holds even for bigger sites too - Facebook for example still runs PHP (compiled to one big binary). Personally I wouldn't use PHP because of the language/std lib but this sort of performance shoot-out should not put you off it.
I always strongly advise not using CakePHP, purely because its model system is so broken, it makes my teeth itch.
(The big four in my book are: ASP.Net MVC, Rails, Django and CakePHP)
We did briefly test ASP.Net on Mono (see another comment in this thread) but didn't include it since we didn't believe that qualifies as a "production" grade ASP.Net MVC deployment.
It pains me to see charts and reporting done like this while leaving out my favorite framework.
Drop me a line if you need a hand with either :)
If you have any questions or see something stupid we did, please let us know. We'd like to correct any mistakes straight away, especially since we're certainly not experts on all of these frameworks and platforms.
https://github.com/rails-api/rails-api
What webserver were you using on JRuby? Was it Trinidad? Did you try Jetpack?
I wanted to find the leanest Web framework on any kind of platform; the difference from your approach is that I already knew the kind of code that would run on it.
I tested: Go, Java (servlet, dropwizard), Scala (scalatra), Ruby, Node.js (connect).
For me it was:
* Scala
* Java
* Clojure (equal to Java - big surprise here)
* Node.js
* Go (almost equal to Node.js)
* Ruby (far far down)
Scala took the lead with amazing results. Moreover, latency was a good metric, and Scala was the only one whose latency was measured in microseconds.
I'm not a fan of Scala because of its surrounding tools, which is why I'm still considering going for either Clojure or Node.js.
I think the most pleasant surprise was Clojure, given that it is a dynamic language. The most unpleasant surprise was Go: by itself it is impressive, but when given real work (Web handling, Redis/mongodb) it goes bad quickly. Happy to see this correlates with your findings too; I'm assuming this is a symptom of library maturity..?
I'd be happy to see how Scala fares on your tests.
You've done an awesome job!
This started out as a small exercise, that quickly ballooned because we were curious about every framework and platform. Obviously we had to stop somewhere, but we're very interested in adding more tests in the future. In fact, we're hoping the community will help us out as well!
I'm working with a setup like this and just love it!
Given the amount of interest in Haskell and Yesod around here, it is strange that it is missing.
"On the back-end, we use Java, Ruby, Python, .NET, PHP and others based on what makes sense balancing server performance, scalability, hosting costs, development efficiency, and your internal development team's capabilities."
That said, we'd love to hear what we did wrong in the Go tests so that we can fix those up.
We'll be posting follow-ups once we've had a chance to go through all the recommended tweaks.
I see more people making uninformed "haskell sucks" posts than expressing interest in it.
>and Yesod
Really? Yesod is the anti-haskell haskell framework.
There's a consistent, considerable gap between their "raw" benchmarks (things like netty, node, plain php, etc.) and frameworks hosted on those same platforms. I think this is something we should keep in mind when we're tuning performance-sensitive portions of APIs and the like. We may actually need to revisit our framework choice and implement selected portions outside of it (just like ruby developers sometimes write performance-critical pieces of gems in C etc.) or optimize the framework further.
I'd like to crunch these numbers further to get a "framework optimization index" which would be the percentage slowdown or ratio of performance between the host platform and the performance of the framework on top of it. I might do this later if I get a chance.
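A rough sketch of what such an index could look like, in Python. The function name and the sample throughputs are placeholders made up to show the arithmetic; they are not figures from the benchmark.

```python
def optimization_index(platform_rps: float, framework_rps: float) -> tuple[float, float]:
    """Return (ratio, percent slowdown) of a framework relative to its host platform."""
    if platform_rps <= 0 or framework_rps <= 0:
        raise ValueError("throughputs must be positive")
    ratio = platform_rps / framework_rps
    slowdown_pct = (1 - framework_rps / platform_rps) * 100
    return ratio, slowdown_pct

# Placeholder numbers for illustration only, not measured results.
ratio, slowdown = optimization_index(platform_rps=10_000, framework_rps=2_500)
print(f"{ratio:.1f}x slower, {slowdown:.0f}% of platform throughput lost")
```

Either number works as the index; the ratio is easier to compare across platforms, while the percentage reads better when the gap is small.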
Do you mind breaking it down for me a bit please?
Cost in dollars or cost in hardware utilization or some other cost?
Either way, good architectures usually move the high-traffic or CPU-heavy areas away from a scripting language anyway.
Thanks for the really informative post! Go seems to be a good balance as a high performance language without having to go back to my traumatic Java days.
As a rule I like to divide this world into "Featureless" and "Featurefull" products.
When you use Rails, you're aiming to pile up features. You want to react to Product managers, to users, you want to work fast and satisfy the needs of customers - or else you won't have anyone to build to.
In this reality, the fact that you're doing 20req/s is OK. In fact, I'm betting that even when you take Go or Node.js - pile up all of the infrastructure and features that exist in Rails, and pile up a ton of your code - buggy and not buggy - you'll get around the same kind of satisfaction index from users.
This is because your product can be perceived as slow even though your servers are blazingly fast.
On the other side of the spectrum there are "Featureless" products. These are infrastructural products. A logging service. An analytics service. A full-text search. A classification and recommendation engine.
These you don't want to build in Rails. I'm sure you haven't even considered it. These you want to build with one of the top-notch libraries that this survey indicates.
I agree, a remarkable take-away for us was how dramatically our i7s excelled over the EC2 instances. Admittedly, those were EC2 Large and not Extra Large instances.
A previous draft of this blog entry had a long-winded analysis of hosting costs--discussing the balance between ease and peace-of-mind provided by something like AWS versus the raw performance of owned hardware--but we elected to remove that since it wasn't really the point of the exercise.
We're actually very interested in how the large/newer instances perform.
30-50x performance difference gets really... real, no? The standard refrain of "throw more hardware at it" must reconcile with the fact that a factor of 30-50x means real dollars for the same amount of load. Is the developer productivity really that much greater?
I fully respect the JVM family of languages as well. I just think that Mark Twain said it best when he said: "There are three kinds of lies: lies, damned lies, and statistics." It's not that the numbers aren't true, it's that they may not matter as much, and in the way, that we initially perceive them.
Performance is certainly something you should consider when selecting a language/framework, but it is not the only thing.
========================
You should undertake a detailed examination of these statistics before making any decisions.
Issue #1) The 30-50x performance difference only exists in a very limited scenario that you're unlikely to encounter in the real world.
Look carefully at the tests performed. The first test is an extraordinarily simple operation: take this string, serialize it, and send it to the client. This is the test in which we see massive differences:
Gemini vs Rails
25,264/687 (gemini/rails-ruby) = 36.774
25,264/871 (gemini/rails-jruby) = 29.006
Node.js vs Rails
10,541/687 (nodejs/rails-ruby) = 15.343
10,541/871 (nodejs/rails-jruby) = 12.102
That's a 37x performance win for Gemini, and 15x for Node.js.
Side note: You might be wondering why I didn't compare to the top performer, Netty. Netty is more like Rack. You build frameworks on top of Netty, not with Netty. As a Ruby dev, you could think of this in the same context of comparing Ruby on Rails with Rack; not a good comparison. Hence we won't compare Rails to Netty.
The error would be in extrapolating that a move to Gemini or Node.js would give you a 37x or 15x performance increase in your application. To understand why this is an error, we jump down to the "Database access test (multiple queries)" benchmark.
Issue #2) Performance differences for one task don't always correlate proportionally with performance differences for all tasks.
In the multi-query database access test, we start to see the top JSON performers slow down significantly when compared to the slowdown for Rails:
Gemini vs Rails
663/89 (gemini/rails-ruby) = 7.449
663/108 (gemini/rails-jruby) = 6.138
Node.js vs Rails
116/108 (nodejs-mysql-raw/rails-jruby) = 1.074
60/108 (nodejs-mysql/rails-jruby) = 0.556
In this scenario -- which is arguably much closer to the real world -- Ruby on Rails closes the gap and even beats some of the hip new kids.
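For anyone who wants to redo the arithmetic, here is a small Python sketch using only the requests-per-second figures quoted above. The dictionary layout and the `ratio` helper are my own scaffolding, and I'm assuming 89 and 108 are the rails-ruby and rails-jruby results in the multi-query test, as in the Gemini lines.

```python
# Requests per second as quoted in this comment.
json_test = {"gemini": 25_264, "nodejs": 10_541,
             "rails-ruby": 687, "rails-jruby": 871}
multi_query = {"gemini": 663, "nodejs-mysql-raw": 116, "nodejs-mysql": 60,
               "rails-ruby": 89, "rails-jruby": 108}  # Rails labels assumed

def ratio(results: dict, fast: str, slow: str) -> float:
    """How many times more requests per second `fast` serves than `slow`."""
    return results[fast] / results[slow]

json_gap = ratio(json_test, "gemini", "rails-ruby")
db_gap = ratio(multi_query, "gemini", "rails-ruby")
print(f"JSON test:        gemini/rails-ruby = {json_gap:.3f}")
print(f"Multi-query test: gemini/rails-ruby = {db_gap:.3f}")
print(f"The gap shrinks by a factor of {json_gap / db_gap:.1f}")
```

The last line puts a number on how much of the JSON-test lead survives once real database work is involved.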
But why? The in-depth answer to this question would require a lot of space, but the really, really short version is kind of a "what's the sound of one hand clapping" response: Ruby isn't actually all that slow.
To understand what the hell that means, check out this presentation from Alex Gaynor (of rdio/Topaz fame):
https://speakerdeck.com/alex/why-python-ruby-and-javascript-...
Ruby is just about as fast as C, provided you're comparing it to C that does exactly the same operations on the hardware as the Ruby code. Don't get me wrong, that's a HUGE provision. But it warrants close examination.
The real benefit of lower level languages like C is that they give you the flexibility to drill down into your actual bare-metal operations and optimize the way the program executes on the hardware. As Alex points out, we don't currently have that level of flexibility in languages like Ruby (without dropping down to inline C), so we suffer a performance penalty.
This penalty is huge for simple tasks because they involve only a handful of operations that execute extremely quickly. As you add complexity, however, the benefits of micro-optimizations get lost in the vastness of the overall execution time.
Look at it like this. When Gemini hits 36,717 req/s in the JSON test, each request only lasts about 1.6 ms. This is only possible because of the simplicity of the operations being done on the hardware. Ruby loses big here because there is a lower boundary to the way you can optimize without dropping down to C.
gemini: 1.6 ms per request
rails-ruby: 87.3 ms per request
When we look at the multi-query database access test, we can see how the optimization at the low level gets lost in the sea of time taken to process the request.
gemini: 90.5 ms per request
rails-ruby: 674.2 ms per request
Granted, that is still over a 7x performance win for Gemini, but this is where the Ruby arguments about programmer efficiency come into play. I don't know Gemini, so it may very well beat Rails in that comparison too. Ruby is getting more performant with every release though, so it's easier to justify on the basis of preference alone when we're this close.
Seriously, in my rough hello-world and SQLite increment-by-1 benchmark, a Bjoern+WSGI app runs 2x as fast as nodejs.
"Sadly Django provides no connection pooling and in fact closes and re-opens a connection for every request. All the other tests use pooling."
But it's free, open-source software and we provide asynchronous database connection pooling for PostgreSQL:
For one person to do a benchmark over this many samples, they have to just go with the out of the box setup for each.
Also, the difference between EC2 and local i7 hardware is glaringly obvious. At what scale does owning the server hardware become imperative?
I know these questions are beyond the scope of a performance review, but inquiring minds would like to know.
Some things are really difficult to answer in a vacuum. If you already have a competent devops staff, hosting your own hardware is probably beneficial. The increased performance per "server" is substantial. But no devops staff? Then it's either very risky or cost-prohibitive to own hardware.
We posted the deployment approach for each framework to the Github page.
I am completely stunned by the performance cost of using ORM/AR, and will be using this to shame our team lead into giving it up and going for raw queries.
php - https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...
php-raw - https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...
For "php", they used an ORM. For php-raw they used the stdlib (pdo).
I'd also like to see a "light" framework meant for building APIs, like Slim, for instance.
We'd also love to have anyone who has production experience with those to contribute a test for them. The test should be fairly quick to write.
Oh really? Then why did Zed write such an angry rant about how you are doing it wrong?
http://zedshaw.com/essays/programmer_stats.html
Can we please see some standard deviations, at least?
I would love to see php-fpm on nginx included in this test.
ServiceStack 9615ms
WebApi 30607ms
GitHub project for benchmarks used: https://github.com/anilmujagic/ServiceBenchmark
But don't quote me on that! :)
I disagree with the implication here (that this is a good point for comparison because "real-world application's performance can only get worse."). Yes it can only get worse but how much worse (per unit of "features") is both significant and unaddressed.
This isn't the best example but look at the gap between the top and bottom of the scale in the Database access test (single query) and Database access test (multiple queries) charts: In the first, Gemini is ~340x faster than Cake, in the second, only ~23x faster. There is still a big gap but it closed by an order of magnitude once you stepped past the most trivial possible DB access test.
So nodejs or php-raw is faster than cake at a single DB access, but what about when you create a real world scenario with authentication, requirement to be able to update features faster (i.e. use an ORM), env. portability requirement, etc.? It seems to me this would look like a little slower, a little slower, a little slower in the {raw} versions, and already included, already included, already included in Rails or Cake. The full featured frameworks take a lot of their performance penalty up-front, with less of a hit as features are added (maybe? :P).
My point is that it's not reasonable to assume that hackernews-benchmarks will actually reflect production use. That said I think the article is cool, and agree that it's good to keep framework authors' feet to the fire regarding performance!
We know that there is a set of common features and the benchmark's goal is to test least common denominator stuff on the networks. Authentication and portability are not LCD. The argument that they are is capricious. What if we made the requirement be that the framework is a lisp? Now we've completely changed the intent.
And also maintains .NET's fastest JSON and Text Serializers: http://theburningmonk.com/2011/11/performance-test-json-seri...
Either way, I'm shocked to see Play perform so slowly by comparison. Although it's easily 10x faster than rails on most tests, I'm shocked to see Node.js faster than Play! (by 2x in most cases) Wow!!
Maybe Node.js critics should start appreciating it after all..
Disclaimer: I am a colleague of the author of the linked article.
Seriously though, this isn't news to anyone that does this professionally. The further up the abstraction curve you climb, the less performant the code will be. Ease of development vs run-time performance.
Most folks who have run Rails at scale, for example, find that the untuned garbage collector in MRI (Ruby's default interpreter) introduces a large amount of variance.
(Note that the very first data panel in the intro is an image and doesn't have tabs.)
It runs pretty well, scoring similar to webgo for the JSON pong benchmark, is almost at the same level as grails for the 1 query benchmark and is slightly faster than Play for the last benchmark.
So, Yesod is in the same performance range as Play or Grails, and is 3~4x faster than Django or Rails.
But I've tested those on a single-core VirtualBox, and I know Yesod scales pretty well in a multi-threaded environment.
Also, keep in mind that most of the top performing frameworks are not fully featured web frameworks but asynchronous I/O libraries (netty, go, nodejs, vertx, ...) whose implementations just mindlessly write the raw response directly to the socket, whatever the HTTP request was.
[1] https://github.com/TechEmpower/FrameworkBenchmarks/pull/39
If you want to help, check out this pull request: https://github.com/TechEmpower/FrameworkBenchmarks/pull/14/f...
Anyway, for me speed of development is more important than performance on the server. Maybe my servers do not get that many visits.
Pull request, maybe? :)
>We attempted to take advantage of threading/multiple-processes as best we could (see the nodejs code for an example of using the cluster module). But we suspect there are additional areas of improvement here.
What do you think, bhauer? :)
Maybe a few elaborate scenarios are posted and people can simply submit their best setup/config/code to be benchmarked. I imagine devs would improve on it over time and eventually the most optimized would surface?
It might be more of a real world test to include mongoose with the node+express test, but for the node-only test the native driver might be more appropriate.
For instance, if you're running on EC2 hardware (which is common enough) and you're executing ~20 DB queries/request (which is probably, unfortunately, common enough), the difference between Java servlets and Rails is more like 10ms.
Then what happens when more than 89 real-world users start to hit your Rails server each second?
(Note: I really like Ruby and Rails. Much more so than Java and its offerings.)
http://vertxproject.wordpress.com/2012/05/09/vert-x-vs-node-...
"We've got Zend as a next target for PHP frameworks."
2) The JSON serializer in Django 1.4 uses a method which is known to be very slow, but which is easily portable across different platforms and works with older versions of Python. They no doubt included it for easy bundling. In a real application you would probably want to simply use the normal JSON serializer from the standard library (which is many times faster).
3) The examples are little more than "hello world". I did some benchmark tests with several Python async frameworks, Pypy, and Node.js for an application I was working on. With small JSON objects there wasn't much difference in performance. Once you started using large JSON objects the performance lines for all versions were indistinguishable from each other. The performance bottlenecks were in libraries, and those standard libraries were all written in 'C', so interpreter versus compiler versus JIT made little difference.
4) The problem with "toy" examples is that in real life there are two performance factors which must be taken into account. Think of it as y = mx + b. With a toy example you are probably only measuring "b". With most real life applications it's "m" that matters. There are often different optimization approaches that are best for varying ratios of "b" and "m". You have to know your application intimately and benchmark using data which is realistic for that application.
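Here is a toy Python sketch of the y = mx + b point. Both cost profiles are invented for illustration (they don't correspond to any framework in the benchmark): one has a tiny fixed overhead b but a steep per-unit cost m, the other the reverse.

```python
def request_time_ms(work_units: float, m: float, b: float) -> float:
    """Linear cost model: y = m*x + b, all times in milliseconds."""
    return m * work_units + b

# Hypothetical profiles, not measurements.
lean = {"m": 0.5, "b": 0.1}    # low fixed overhead, expensive per unit of work
heavy = {"m": 0.3, "b": 2.0}   # high fixed overhead, cheaper per unit of work

for x in (0, 5, 10, 100):
    t_lean = request_time_ms(x, **lean)
    t_heavy = request_time_ms(x, **heavy)
    winner = "lean" if t_lean < t_heavy else "heavy"
    print(f"x={x:>3}: lean={t_lean:6.1f} ms  heavy={t_heavy:6.1f} ms  -> {winner}")

# The two lines cross where m1*x + b1 == m2*x + b2.
crossover = (heavy["b"] - lean["b"]) / (lean["m"] - heavy["m"])
print(f"crossover at x = {crossover:.1f}")
```

A hello-world benchmark sits at x = 0 and only ever sees b, so it would pick "lean" even if the real workload lives far to the right of the crossover.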
Python has a reputation for being "easy to learn". However, it is "easy" in the sense of being able to hack something together that works without knowing very much. There can be several different ways of doing things and doing it one way versus another way can mean a difference in performance of several orders of magnitude. The same may be true for some of the other languages, but I haven't examined them in enough detail to say.
I think you should tweak your tests to use more real world like examples. I realize it would be hard to do this across frameworks.
Like let's have a database query pull user record from 100,000 users by username. And maybe do md5 on password.
And on top of this, comparing database requests is meaningless, as the blocking nature of the framework itself is the major bottleneck, and not the database.
Now, for a normal web application, the largest share of requests would be for static content, or cached content within the web app, so the real gain would be a tiny fraction between python/ruby and go/java-based frameworks.
That said, if you want to handle static content (images and such) from within your app, or build a JavaScript-centric application with lots of tiny requests, or even persistent ones... neither rails nor django would do.
1) Add a raw http test -- no template compilation or html, just return "OK". Would give a relative idea of the cost to "just turn the thing on"
2) Don't json encode after the database tests. By json encoding you're doing two things but saying you're only testing one.
I come from the python/django world, and I know that different python json packages have orders of magnitude differences in their performance. From that I can infer that there are probably similar or greater cross-language differences -- I suspect node.js having json as a native object type helps immensely.
Something about that phrase "Let us simply draw the curtain of charity..." resulted in an immediate spittake. Coffee everywhere.
Thanks again! :)
http://www.techempower.com/blog/2013/03/28/framework-benchma...
(Note the initial chart in the introduction section is an image and it won't be affected.)
For instance, in the Express example code, they're sending JS objects rather than serializing them to raw data that the socket can just send. Instead, serialization/copying are happening on each request, which is a significant overhead.
I don't have domain knowledge over many others, but I suspect a similar problem might exist with others.
The netty code creates the ObjectMapper once and uses it for all requests, whereas golang code creates the json encoder for every request (enc := json.NewEncoder(w)). Just getting rid of that would make this trivial code so much faster.
If you seriously thought that the difference between a netty and django would be 4 times then you simply don't understand what these frameworks do in the first place.
I would have guessed something like less than 100 times slower and that would have been still fine since the cost of all kinds of latencies in the system usually far outweigh the speed of the framework itself.
Useful comparison anyway. Seems Go struck a good balance.
a) Please publish not only the requests per second, but also the memory and CPU usage of the machine for each framework. b) For Java systems, can you publish the heap configuration of the JVM?
Cheers!
You should show us the cpu usage. If you had done so, you would have found how absurd a mistake you had made.
(Disclaimer: I'm the Vert.x project lead)
a) bench on a single core processor (a small EC2 instance)
b) configure node.js as a cluster with as many instances as processors.
How the hell can you compare accessing a MySQL database to accessing a MongoDB database?
It's like comparing apples to piles of poop.
Also when you're testing things like Django in web requests, you're testing gunicorn, not Django.