iMerNibor 10y ago

I'd imagine encoding/decoding will be insignificant compared to generating the requested page or fetching data from a database in most, if not all, cases.

7 comments · 1 top-level

geocar 10y ago

Most websites do not see more than 100 requests per second.

In those cases you are correct: parsing and de-parsing is insignificant compared to the amount of energy the computer is using to heat the room.

However, in order to do a trillion requests per day you need around 30 machines using a custom web server, or 300 machines using FastCGI: in this situation the cost differs by an order of magnitude.
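To put rough numbers on that claim, here is a small sketch of the arithmetic. It only assumes the figures quoted above (one trillion requests per day, 30 versus 300 machines) and a perfectly even load, which real traffic will not have; the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_machine_rate(requests_per_day: float, machines: int) -> float:
    """Average requests per second each machine must sustain (even load assumed)."""
    return requests_per_day / SECONDS_PER_DAY / machines


total = 1e12  # one trillion requests per day, the figure from the comment
custom = per_machine_rate(total, 30)    # custom web server fleet
fastcgi = per_machine_rate(total, 300)  # web server + FastCGI fleet

print(f"whole fleet:     {total / SECONDS_PER_DAY:,.0f} req/s")
print(f"custom server:   {custom:,.0f} req/s per machine")
print(f"FastCGI backend: {fastcgi:,.0f} req/s per machine")
```

That works out to roughly 11.6 million requests per second across the fleet, so about 386,000 per machine in the 30-machine case and about 38,600 per machine in the 300-machine case. These are averages; peak traffic would need headroom on top.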

jerf 10y ago

Many people observe that as miles-per-gallon gets better and better, it begins to become a deceptive measurement in a way, because going from 10 to 20 mpg is a much, much larger change than going from 30 to 40, or even from 80 to 140. It seems people get a better sense of what's going on to measure gallons per mile. When you start doing that it becomes more clear that going from .0001 gallons per mile to .00001 gallons per mile, as large as it may be in orders of magnitude, still isn't that big a deal. Either way you're looking at your cost-of-fuel being effectively zero for all practical use cases, because your costs will be dominated by something else.

Similarly, I've noticed that people tend to get a little silly about web server requests-per-second. It really gets to the point you probably ought to be talking about seconds per request, or perhaps rather, microseconds per request or something.

Because A: as you start talking about these fast servers, you need to contemplate whether your code can run in, say, 2.5 microseconds either; who cares whether your webserver takes 2 or 25 microseconds to handle a minimal request if your minimal response requires 8 milliseconds (i.e. "8000 microseconds")? 8ms would actually be pretty decent performance for a wide variety of non-trivial web requests.
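Point A can be made concrete with the numbers in the paragraph above. This is only an illustrative calculation: it assumes server overhead and handler time simply add up, and the names are hypothetical.

```python
HANDLER_US = 8000.0  # the "8 ms" application response from the paragraph above


def end_to_end_us(server_overhead_us: float, handler_us: float = HANDLER_US) -> float:
    """Total time per request, assuming overhead and handler time are additive."""
    return server_overhead_us + handler_us


fast = end_to_end_us(2.0)   # server that spends 2 microseconds per request
slow = end_to_end_us(25.0)  # server that spends 25 microseconds per request

overhead_ratio = 25.0 / 2.0                   # how the benchmark sees it
end_to_end_gain_pct = (slow / fast - 1) * 100  # how the user sees it

print(f"server-only ratio: {overhead_ratio:.1f}x")
print(f"end-to-end: {fast:.0f} us vs {slow:.0f} us ({end_to_end_gain_pct:.2f}% difference)")
```

Measured as requests per second on a minimal request, the faster server looks 12.5 times better. Measured as time per real request, the difference is under a third of a percent.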

And B: As the webservers get faster and faster, you really need to start wondering what corners they cut to push their reqs/s number up. I can make a blazingly fast webserver that would actually kill nginx's performance stone dead for a "return a constant JSON string response" task... the trick is that I'm not even going to look at the incoming web request, I'm going to just receive a socket, blast out my answer as a constant string buffer without even reading from the socket, and discard the socket. (If you're feeling particularly saucy, hook that up to a user-space TCP stack so you can drop the work of properly setting up and tearing down TCP connections.) There aren't that many real-world tasks for which that is a good solution (though, non-zero!), but it'll look like pure awesomesauce on the benchmark!
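For illustration, here is a minimal sketch of the kind of benchmark-gaming "server" described in point B. It accepts a connection, writes a constant response, and closes, without ever calling recv(). Everything here (names, the JSON payload, the loopback address) is made up for the example, and it is deliberately not a correct HTTP server.

```python
import socket
import threading

# A fixed, pre-built response; the body is 11 bytes long.
RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Length: 11\r\n"
    b"Connection: close\r\n"
    b"\r\n"
    b'{"ok":true}'
)


def serve_blindly(listener: socket.socket, max_connections: int) -> None:
    """Answer each connection with RESPONSE without reading the request."""
    for _ in range(max_connections):
        conn, _addr = listener.accept()
        with conn:
            conn.sendall(RESPONSE)  # no recv(): method, path, headers are all ignored


def start_cheating_server(max_connections: int = 1):
    """Listen on an OS-chosen loopback port and serve in a background thread."""
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]
    worker = threading.Thread(
        target=serve_blindly, args=(listener, max_connections), daemon=True
    )
    worker.start()
    return listener, port, worker


def fetch_without_asking(port: int) -> bytes:
    """Connect and read until EOF without sending any request at all."""
    chunks = []
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

The client helper gets a full "200 OK" without having asked for anything, which is the corner being cut. A server like this would return the same bytes for a malformed request, a POST, or a request for a different path, so a high number from it says nothing about how it would handle real HTTP.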

Properly handling HTTP is a non-trivial problem, and even more so if it's going to be hooked up to a program rather than a static file system or something similarly easy. I actually start getting nervous about web servers that show excessively high numbers. If your performance is much better than nginx, rather than me cheering for joy, I actually have a lot of questions about how you did that exactly, and what my website's security profile looks like with your way-faster server. I'm not saying these questions are completely unanswerable; perhaps there is a way to safely do a much faster web server. I'm just saying that rather than my default response being celebration and "Oh wowzers cool!", my default reaction is a healthy dollop of skepticism.

rkeene2 10y ago

My web server, filed ( http://filed.rkeene.org/ ), is faster than nginx for serving static content by doing two things: 1. Only handling static files 2. Being extremely optimized for serving static content

It's very safe as far as I can tell: I have run it under AFL with ASan enabled and seen no crashes, and I have run it in production on the public Internet.

A few of the optimizations I do in "filed" could also be done in nginx, but most would cost too much.

A separate logging thread, fed through a queue, helps a lot and was one of the main reasons for writing "filed" -- my ability to serve files was being slowed by my ability to write logs indicating that I had served something. The downside is that there may be a large queue of unwritten logs in the event of a kernel panic or other unexpected process termination.
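The general pattern described here (request path enqueues, a dedicated thread writes) can be sketched as follows. This is not the code from "filed"; the class name, the bounded queue, and the drop-when-full policy are all assumptions made for the example.

```python
import queue
import threading


class AsyncLogger:
    """Hypothetical logger: callers enqueue lines, one thread does the writing."""

    def __init__(self, sink, max_pending: int = 10_000):
        self._sink = sink  # any callable taking one log line, e.g. a file's write
        self._queue = queue.Queue(maxsize=max_pending)
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def log(self, line: str) -> bool:
        """Called on the request path; never blocks. Returns False if dropped."""
        try:
            self._queue.put_nowait(line)
            return True
        except queue.Full:
            return False  # assumed policy: drop rather than stall a request

    def _drain(self) -> None:
        """Runs in the logging thread: write lines until the None sentinel."""
        while True:
            line = self._queue.get()
            if line is None:
                break
            self._sink(line)

    def close(self) -> None:
        """Flush everything queued so far, then stop the logging thread."""
        self._queue.put(None)
        self._thread.join()
```

A usage example would be `logger = AsyncLogger(log_file.write)` followed by `logger.log("GET /index.html 200\n")` in the request handler. Anything still sitting in the queue when the process dies is lost, which is the trade-off mentioned above.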

Most requests don't even open the file they are serving because "filed" caches open file descriptors -- once the file has been opened it's kept open until the cache entry is needed for a newer file.
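A descriptor cache along these lines might look like the sketch below. The comment does not say how "filed" picks which entry to reuse, so the least-recently-used eviction here is an assumption, as are the class and method names. It relies on os.pread, which is available on Unix-like systems.

```python
import os
from collections import OrderedDict


class FdCache:
    """Hypothetical cache of open read-only descriptors with LRU eviction."""

    def __init__(self, capacity: int):
        self._capacity = capacity
        self._fds = OrderedDict()  # path -> fd, least recently used first

    def get(self, path: str) -> int:
        """Return an open descriptor for path, opening it only on a cache miss."""
        fd = self._fds.get(path)
        if fd is not None:
            self._fds.move_to_end(path)  # mark as most recently used
            return fd
        fd = os.open(path, os.O_RDONLY)
        self._fds[path] = fd
        if len(self._fds) > self._capacity:
            # The slot is needed for a newer file: close the oldest descriptor.
            _old_path, old_fd = self._fds.popitem(last=False)
            os.close(old_fd)
        return fd

    def read(self, path: str, length: int, offset: int = 0) -> bytes:
        """Positional read, so the shared descriptor needs no seek."""
        return os.pread(self.get(path), length, offset)

    def close(self) -> None:
        """Close every cached descriptor."""
        while self._fds:
            _path, fd = self._fds.popitem()
            os.close(fd)
```

On a cache hit the request costs a dictionary lookup and a read, with no open or close system call. One consequence of this design is that a file replaced on disk keeps being served from the old descriptor until its entry is evicted.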

There are no runtime allocations after startup except for log entries, leading to very consistent performance under load.

JoeAltmaier 10y ago

Re: gallons of gas. There's the old puzzle: your spouse gets 100MPG in that super-hybrid-mobile. The salesperson wants to upgrade you for $1000 to the super-duper-hybrid-mobile at 200MPG! Double the mileage!

You suggest instead that you get the old truck serviced and replace the plugs, distributor and tailpipe. Estimated cost $1000, and should get you from 10MPG to 11MPG. Which is the better deal? Assuming you both drive about 100 miles per week.
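The puzzle resolves once it is restated in gallons rather than miles per gallon. A quick sketch using only the figures given (100 miles per week for each vehicle); the function name is illustrative.

```python
def weekly_gallons(miles_per_week: float, mpg: float) -> float:
    """Fuel burned per week at a given mileage."""
    return miles_per_week / mpg


MILES_PER_WEEK = 100

# Hybrid upgrade: 100 MPG -> 200 MPG
hybrid_saving = weekly_gallons(MILES_PER_WEEK, 100) - weekly_gallons(MILES_PER_WEEK, 200)

# Truck service: 10 MPG -> 11 MPG
truck_saving = weekly_gallons(MILES_PER_WEEK, 10) - weekly_gallons(MILES_PER_WEEK, 11)

print(f"hybrid upgrade saves {hybrid_saving:.2f} gallons/week")
print(f"truck service saves  {truck_saving:.2f} gallons/week")
```

Doubling the hybrid's mileage saves half a gallon a week, while the one-MPG improvement on the truck saves about 0.91 gallons a week, so for the same $1000 the truck service saves more fuel.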

geocar 10y ago

That is a very good point, but it's not the argument here: I'm responding specifically to the idea that webserver and webserver+fastcgi are "technically" the same speed.

Running two web servers (one speaking HTTP and one speaking FastCGI) is necessarily going to be slower than running one web server.

This should be obvious, although it might be "not significantly slower", which is why I provided some real numbers from my experience to show at which point it becomes slower by an order of magnitude.

You might also find that it's easier to debug one webserver than two.

mhd 10y ago

If you managed to get a trillion requests per day with a "hello world" scenario, you're probably also able to get 270 machines for free from your gullible incubator.

geocar 10y ago

Most ad servers deliver static or hello-world-style content, doing no database lookups, but logging their results.

RTB systems have about 30-100msec for the entire transaction (and that includes network to the user), so you need better control of your latency anyway.
