Traditional Web Server:
The pizza shop receives a call for the initial order and starts the pie. Then the customer calls back periodically to check if the pie is done because the pizza shop cannot call back or deliver.
In a traditional web server, the person making pizzas only makes one at a time and can't do anything else until that pizza is done and delivered. In an evented model, each pizza maker makes many pizzas at once, and just watches for events (the oven timer goes off) and acts on them (takes the pizza out). Sometimes, if a task is taking him too long, he'll ask an assistant to work on it so that he can get back to making pies.
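A rough sketch of the evented pizza maker in Python's asyncio, in case code helps more than pizza. The names (`bake`, `run_shop`, `take_orders`) and the oven times are made up for illustration, and `asyncio.sleep` stands in for whatever the real server would be waiting on (disk, network, a database).

```python
import asyncio


async def bake(pizza, oven_time, finished):
    # Put the pie in the oven. The await hands control back to the event
    # loop, so the pizza maker is free to start other pies meanwhile.
    await asyncio.sleep(oven_time)
    # The oven timer went off: take the pizza out.
    finished.append(pizza)


async def run_shop(orders):
    finished = []
    # Start every pie at once and wait until all the timers have fired.
    await asyncio.gather(*(bake(name, t, finished) for name, t in orders.items()))
    return finished


def take_orders(orders):
    # One pizza maker (one thread, one event loop) handles all the orders.
    return asyncio.run(run_shop(orders))


if __name__ == "__main__":
    print(take_orders({"margherita": 0.15, "pepperoni": 0.05, "veggie": 0.10}))
```

The pies come out in the order their timers fire, not the order they were phoned in, and the whole batch takes roughly as long as the slowest pie rather than the sum of all of them.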
The hardware is doing fundamentally the same thing either way, but in the threaded model, it's also doing a lot of other stuff that you probably don't care about.
So, to explain it to my grandma: It's just a simpler way to think about it. There isn't really a big difference.
No, the advantage of doing it in userspace code is that you can paper over the deficiencies of the layers under the one you are working in with significant manual effort. Node.js introduced the concurrency problems when it selected JavaScript as one of the layers; it doesn't get much credit in my mind for then solving them at great effort and with horrible damage done to the resulting program structures. The problems Node.js solves are not fundamental to programming, they are fundamental to JavaScript.
Pick something like Erlang and the problem never exists in the first place. You don't have to paper over the deficiencies in the lower levels, because the levels below the code you're writing aren't deficient for concurrency in the first place.
I should point out that in general this is not necessarily a bad thing; alas, there's always some way your lower layers are deficient, and it's far worse when they make it impossible to paper over the problem. Still, you will never end up with simpler code. More specialized, oh my yes, but certainly not simpler. And the wisdom of picking a layer that is fundamentally deficient for your core target problem then papering over it seems pretty limited to me.
Now, be fair: Erlang papers over the deficiencies of the lower layers (namely, the kernel) with significant manual effort. It's just that someone else has already gone to this effort. BEAM is an event-based server behind the scenes, with lots of syntactic sugar to make it look multithreaded. The PLT Scheme webserver is another example of this.
I also didn't mean to suggest that the code itself is necessarily simpler. It can be, but (as the node.js example proves) it is certainly not always. The scheduling algorithm, on the other hand, is typically much simpler. Namely, it's typically "round-robin cooperative multitasking." No serious operating system since Windows for Workgroups has actually tried to use round-robin cooperative multitasking.
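To show how little there is to "round-robin cooperative multitasking", here is a toy scheduler built on Python generators. It is only an illustration, not how BEAM or node.js actually schedule work; `task` and `run_round_robin` are names invented for the example.

```python
from collections import deque


def task(name, steps, log):
    # A "thread" that does one unit of work per turn.
    for i in range(steps):
        log.append(f"{name}{i}")
        yield  # cooperative: voluntarily hand control back to the scheduler


def run_round_robin(tasks):
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run the task until its next yield
        except StopIteration:
            continue               # task finished, drop it from the queue
        ready.append(current)      # otherwise it goes to the back of the line


if __name__ == "__main__":
    log = []
    run_round_robin([task("a", 2, log), task("b", 3, log)])
    print(log)  # ['a0', 'b0', 'a1', 'b1', 'b2']
```

The entire scheduler is the `while` loop. The catch is the usual one for cooperative multitasking: a task that never yields starves everyone else.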
All the other "thread stuff" is typically simplified as well: smaller stacks (if any at all), simpler context-switching code, that sort of thing. The actual encoding of the business logic? That depends on the problem.
It's like a waiter at a restaurant: you tell him what you want and a few moments later, the waiter (or server) returns what you've ordered.
Good?
The waiter takes the customer's order and then waits for the food to be prepared before returning with the dish.
Asking the waiter to ask the chef whether or not the fish is fresh is like making a HEAD request.
Edit: and LDAP, and Redis, and medical PACS datastores, and RDBMS in all their multifaceted splendour... a freakish amount of programming boils down to CRUD.
Edit 2: and Gopher. I feel a little sad for forgetting about Gopher.
We've all been to the neighborhood coffee place where the girl will take your order, turn around and make your entire drink, hand it to you and ask for payment. We've all stood in that line.
We've also all been to Starbucks, where the girl takes your order, writes it on a cup, takes your money, then moves on to the next customer. And by the time you walk to the other end of the counter the guy in front of you already has coffee in his hand.
It still doesn't fit web servers exactly, but at least it fits the real world.
To me, it seems that no matter how you take the "messages" to do work, that work still has to be done. It surely doesn't magically use fewer resources because you told the OS that it could just call you back when it is done, as opposed to you having to hang around? Something has to be hanging around on one side or the other, and the "call back" takes resources as well, surely? It seems that you are just trading tit for tat. Maybe the reason is to not utilize some specific resource in the meantime?
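One way to poke at that question: the resource usually being saved is a thread per pending request. A small experiment, assuming Python's asyncio as the event loop (`wait_for_reply`, `serve` and `threads_used` are hypothetical names, and the sleep stands in for a slow backend):

```python
import asyncio
import threading


async def wait_for_reply(delay):
    # Pretend to wait on a slow backend; the event loop is free meanwhile.
    await asyncio.sleep(delay)
    # Report which OS thread resumed this request.
    return threading.get_ident()


async def serve(n, delay):
    # n requests are all "hanging around" at the same time.
    return await asyncio.gather(*(wait_for_reply(delay) for _ in range(n)))


def threads_used(n=1000, delay=0.05):
    # Number of distinct threads that handled the n pending requests.
    return len(set(asyncio.run(serve(n, delay))))


if __name__ == "__main__":
    print(threads_used())
```

Every pending request is resumed on the same thread, so what "hangs around" per request is a small coroutine object and a timer entry rather than a whole thread with its own stack. The work itself is not any cheaper; only the waiting is.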
Of course then Grandma, not being an idiot, will say "why not just have a driver bring the pizza and not have the phone all tied up to begin with?" and she is absolutely correct. It is better to just set up a scenario where you have waiters, and customers show up at the shop, and in the blocking model your waiter doubles as the cook, so you need one waiter per meal... and so on. This analogy passes a slightly closer examination.
http://www.quora.com/Can-someone-explain-poll-epoll-in-Layma...
However, the pizza company can probably still only cook 256 pizzas at the same time (due to running out of pan-handles).
Writing event-driven applications is very prone to errors and invalid (impossible) states.