Perhaps my head has been stuck in javascript/node land for too long but I think accusations about javascript producing callback hell now seem a bit disingenuous even for relative novices to the language.
It's 2016 and there are many well documented and widely adopted solutions arising from external libraries and developments in ECMAScript. Thanks to transpilers like Babel/Typescript we can even shoehorn these new ECMAScript features into older browsers.
"Solutions" is a strong word for what's available in JavaScript. Promise hell isn't better than callback hell, it's just horrible in slightly different situations. To make matters worse, it seems that many JS libraries have just wrapped the old callbacks in promises, meaning that we end up using promises in situations where a callback would actually be easier (because that's how it was originally written).
None of this comes close to the ease of threads in Erlang.
So the lexical structure of my code was linear - while the runtime structure was nested at arbitrary levels. There never was a reason to represent the nested runtime structure in the written code.
I also didn't attempt to use node.js for things it wasn't made for, like compute-intensive tasks or implementing business logic. The good old chat server for ten thousand people was an often used example for node.js programming for a reason - lots of I/O, little processing.
Note: I don't write code like that any more in ES 2015. I also don't use classes, prototype, this, bind, apply - only functions and (lexical) scope (with an eye on capturing only as much scope as I need). Which is the opposite of the above described method where lexical scope was not usable, but with the methods available now the code still is "flat", so that's why I switched.
In the article we have someone promoting the use of a functional language on a functional programming site promoting the use of thread-lexical mutable variables to store state across procedure calls. The alternative the author dismisses is to swallow a return as an input. Something's not quite right here.
If you're doing something in Node that takes nontrivial processing time (that doesn't just involve waiting for I/O), you're Doing It Wrong (tm).
Since the premise of "callback hell" is outdated, the only remaining point is that if you perform complex/time consuming calculations in NodeJS, it slows down.
If you want the most speed, or if you're going to be doing something like a Fibonacci calculation, you should write that (microservice) in Go instead. It's faster from the start (by 2x at least, probably a lot more if you're CPU bound), and GoRoutines can be spread across multiple threads, so a multi-CPU host can take advantage of all of its CPUs.
Oh, and the per-GoRoutine overhead is only about 4K on a Linux host, last I checked. True that Haskell has a smaller overhead per thread, but considering the speed advantage, I think the Go server would still win big on latency for the number of threads that did actually fit in memory.
Running performance tests with a single Node process doesn't feel like it's giving Node a fair chance to perform up to the capabilities of the machine.
There are good reasons we went to thread based models - developer productivity and safety. Event loops are fine for toy demos, or very carefully managed products (trading systems, NGINX) but not for use as general purpose hammers.
Every single bit of extra friction and cognitive overhead costs you dearly a few years down the line. We scrambled away from this stuff as soon as we could, and there's no good reason to go back.
"Callback hell" is IMO a terrible way of doing event-loop systems, as the nesting can get confusing. For implicit event loops like Node, I strongly prefer the Promises approach (preferably with async/await sugar); for explicit event loops (I only have practical experience with Arduino, though I've also a passing acquaintance with the classic 69k Mac as well) I like to build a set of event-driven state machines with cooperative multitasking. If properly designed, these keep everything nicely separated so you can follow the logic without any trouble, while also avoiding the concurrency concerns of a threaded system.
I greatly prefer first class models for where things queue up, and then to use producer/consumer objects against those queues.
What pain awaits me?
EDIT: Just to make it clear early on, I agree with the article's conclusion that Nodejs is not as good at compute heavy workloads as Haskell. I simply object to any use of "the nested callback problem" as valid in 2016. It's an issue exclusively for legacy code and developers who take pride in writing outdated code.
It seems only fair, then, that I should also defend Javascript from people obviously unaware of the state of the art in pseudo-imperative programming. And by state of the art, I mean "has been around in some languages for 3+ years."
The example:
request('http://example.com/random-number', function(error, response1, body) {
  request('http://example.com/random-number', function(error, response2, body) {
    request('http://example.com/random-number', function(error, response3, body) {
      ...
    });
  });
});
But modern Javascript (before you start, yes, it runs on every browser with preprocessing, which is normal for this ecosystem) would make it look more like this:
// rp is a request promise, multiple options for creating them
async function make3StaticRequests() {
  try {
    var res1 = await rp('http://example.com/random-number')
    var res2 = await rp('http://example.com/random-number')
    var res3 = await rp('http://example.com/random-number')
    // ...
  }
  catch(error) {
    // ...
  }
}
// And of course the promise library allows for many things
// you'd like with applicative functors, like binding groups
// of operations together and evaluating them all.
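function randomNumberPromise() {
  // Return the promise so callers have something to await
  return rp('http://example.com/random-number')
}
async function make3StaticRequests() {
  // Promise.all returns a promise, so await it before destructuring
  var [res1, res2, res3] = await Promise.all([randomNumberPromise(),
                                              randomNumberPromise(),
                                              randomNumberPromise()])
  // ...
}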
I don't really understand why people feel comfortable writing up comparison articles without doing sufficient research into what they're comparing things to.
That said, the article's point about large compute workloads starving other operations is very much true, and a good example of the weakness V8 brings as a server-side programming environment.
While Haskell is truly better at concurrency (no need to serialise when passing messages, green threads yield not only at IO but also at memory allocations), that part of the comparison isn't very good. Spawning a cluster of NUMCORES threads using the built in cluster module would be an improvement.
Nodejs has a poor story for compute-intensive loads. People need to be comfortable saying that, because it's reality.
That said, I agree that the article should at least mention the likes of `await`...
Begrudging Javascript 1 or 2 seems somewhat petty given how common it is to see compiler extension preambles at the top of every haskell module.
First, using clustering and memoization would improve the throughput a lot. I did something similar when adapting a JS based script library to be used in node, because I knew it would lock the main loop otherwise. Beyond this, cpu intensive work should be avoided in your service loop regardless. It's best distributed to an RPC/Worker pool.
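To make the memoization point concrete, here is a minimal sketch. It is not taken from the article's benchmark: `memoize` and `fib` are illustrative names, and it assumes the slow route is something like a naive recursive Fibonacci.

```javascript
// Illustrative helper (not from the article): cache results by argument
// so repeated calls with the same input skip the recomputation.
function memoize(fn) {
  const cache = new Map();
  return (n) => {
    if (!cache.has(n)) cache.set(n, fn(n));
    return cache.get(n);
  };
}

// The recursive calls go through the memoized wrapper, so each value
// of n is computed only once and the cost is linear instead of exponential.
const fib = memoize((n) => (n < 2 ? n : fib(n - 1) + fib(n - 2)));
```

This only helps when the same inputs recur. A genuinely new, expensive computation still blocks the loop for as long as it runs, which is where the worker pool comes in.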
In terms of scale, node scales as well as or better than a lot of frameworks; it's only that you will usually want to use similar techniques locally as well as remotely.
Another poor fit is when you need millions of references in a single thread; Node will die spectacularly. That doesn't mean it shouldn't be used for many use cases, it only means that it's bad at some of them.
I find that node is great as an intermediate/translation layer... your UI talks directly to node, tightly coupled.. then node can translate against backend databases or other services as a gatekeeper for your front end. It allows you to make the data the shape that is most convenient, with the least amount of disconnect of thought and approach.
It's also pretty great for certain types of orchestration control and even in the proof of concept stages of applications. Doing a first version of almost anything I've tried in Node is usually much faster than alternative platforms. And often performs well enough to stick with it. Developer productivity is more important than absolute scale at the beginning, and if you have a plan to scale horizontally, you can do that for a while before you need to break off other optimizations.
Do you say this because of the libraries available for it? (curious, wanting to jump to node).
It's just so much faster to get going if you're developing the full stack, and already in JS heavy land anyway. Not having to context switch for the backend is huge... being able to use a document/object/json database isn't as big of a boost but still nice. On the db side, I've been using the template wrappers so that I can write a simple query and it turns it into a parameterized query returning a promise.
async function getRecords(baz) {
  return await sql.query`
    SELECT
      a,
      b
    FROM
      foo
    WHERE
      foo.bar = ${baz}
  `;
}
So, I still have to think about some SQL, but it's still usually better than trying to twist ORMs into shape.
Overall, for the past 6 years or so (since 0.8) I've been using Node pretty heavily (moving from more C# on the backend) and really haven't missed it at all. Even though the core .net and more open-source stuff has been interesting... Managed to dockerize a few trivial .Net apps using the dotnet onbuild base containers.
If you're using windows, the only gotchas are you need a C++ build environment (Visual C++ 2015 Build Tools, checking all options) and Python 2.7.x in order to build any binary modules... most of which now run without issue on windows, was a much bigger problem in 0.8-0.10 ...
1) The Author knows little about JS.
2) Picking on a scripting language which has 2 weeks + 2 years of development vs a 25 year old monster that is the playground of the brightest minds in CS, is easy.
Unfortunately the article also did not go into detail on how the event loop works or show how you can break out of the single thread with the tools that ARE available to you. I guess the idea was to write a hype article for Haskell. Here are some more ideas for the author:
Comparing Haskell and C++ type systems.
Comparing Haskell and Clojure functional purity and lazy evaluation.
Comparing Haskell and Java deterministic parallelism.
Anyone who understands JavaScript can see that the recursion invoked in the slow route is not asynchronous (each recursive invocation keeps piling onto the call stack without ever releasing it)! You'd have to use process.nextTick (or setTimeout) if you wanted to recurse asynchronously without spawning a new process...
For these kinds of unusual, heavy computations, though, you'd be better off using the child_process module to spawn a new process and do the recursion inside that process so that it doesn't block the main event loop.
This has nothing to do with starvation. Node.js just has a completely different approach to this kind of problem.
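A rough sketch of the child_process approach described above, assuming the heavy computation is a naive recursive Fibonacci. `fibInChild` and `childSource` are hypothetical names. Passing the script inline via `node -e` is only there to keep the example in one file; a real service would more likely `fork` a separate module.

```javascript
const { execFile } = require('child_process');

// Source for the child process (hypothetical workload): naive recursive
// Fibonacci. With `node -e <script> <arg>`, the argument is process.argv[1].
const childSource = `
  const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
  process.stdout.write(String(fib(Number(process.argv[1]))));
`;

// Run the computation in a separate Node process. The parent's event
// loop stays free to serve other requests until the callback fires.
function fibInChild(n) {
  return new Promise((resolve, reject) => {
    execFile(process.execPath, ['-e', childSource, String(n)], (error, stdout) => {
      if (error) reject(error);
      else resolve(Number(stdout));
    });
  });
}
```

Starting a process per request is expensive, so in practice you would probably keep a small pool of long-lived children and hand work to them.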
GHC (the leading Haskell compiler) provides an extremely advanced green threading system with the runtime. http://haskell.cs.yale.edu/wp-content/uploads/2013/08/hask03...
> I like the way it's done in Node.js - Explicitly.
If you wanted to do it this way in Haskell, there are a number of monads that make it way more convenient than doing it in Node. Cont in particular can be used for cooperative concurrency. No one really uses these outside of niche cases, however, because threads are almost always a better abstraction.
But I do a lot of REST API (and WebSocket) work and all of the workload that happens inside my Node.js program is extremely lightweight.
If I need to perform some heavy computation, I will offload it to a separate child process - Node.js forces me to put that code in a separate file/module but I actually like this because it encourages separation of concerns. It feels very natural so I don't really need any other special constructs.
I looked into goroutines a while ago; they look cool, but I probably wouldn't use them much because I don't like the idea of having code from the same source file splitting off into multiple processes/threads; it makes it harder to read and reason about the code (this is a bit like what happens with multi-threaded code when you have mutexes all over the place).
To me, this feature has the same utility value as the ability to define multiple classes per source file - Ok, that's cool, but is it a good idea to do that?
Tcl was doing event loops quite successfully in the late 90s and was fairly popular, back in the day.
These days, I'd use Erlang (Elixir).
This is kind of ranty, but also makes me laugh: https://www.youtube.com/watch?v=bzkRVzciAZg
I don't know if you can claim definitively that Node introduced more people to event loops than any other technology, but it certainly is one major popularizer of the idea, and the one that's had the most impact in the past decade.
> Looking near the top of the output, we see that Haskell's run-time system was able to create 100,000 threads while only using 165 megabytes of memory. We are roughly consuming 1.65 kilobytes per thread.
Those are not the same kind of threads that the author is talking about in the beginning of the article. Those are green threads and as such are multiplexed onto a much smaller number of real system threads to do work in parallel. What that means is that they, for example, can't all make a system call at the same time. Go has the same issue.
Yes, they can. The Haskell IO manager knows how to handle this. http://haskell.cs.yale.edu/wp-content/uploads/2013/08/hask03...
As a WebLogic developer we fixed this in the late 90s but the Volano chat benchmark was still run for no apparent reason. Somehow everyone else didn't get the message until Netty was released and people started using it.
Obviously what you want is multi-threaded execution with asynchronous I/O. Using node on multi-core systems just doesn't make a lot of sense as you end up having to duplicate your entire program on each core to get the full performance of the machine. Not unlike 512k/thread but much worse — especially if you cache anything locally in the process like template compilation, etc.
Don't get me wrong, the fact that Node doesn't have a more efficient way of handling overhead on multicore machines is a drawback. But you pay for your abstractions. Nobody is going to argue that Node is a phenomenal solution for lightweight, multi-threaded execution. But most companies using Node seem to accept this and are fine with not utilizing 100% of their resources (hell, most don't even seem to know Node can spawn processes, in my experience).
If they care they use a framework that meets their needs.
node is probably not the best choice for truly CPU bound operations, but you can sometimes get by using the native cluster module to spread work over multiple cores.
https://hn.algolia.com/?query=why%20events%20are%20a%20bad%2...
https://hn.algolia.com/?query=why%20threads%20are%20a%20bad%...
https://hn.algolia.com/?query=threads%20vs%20events&sort=byP...
What would I call "heavy work"? e.g: compression, serialization, encryption, image processing... tasks that are bound by CPU and not only I/O. Usually you want to delegate that to a native module and not do that yourself in JavaScript. If you absolutely have to do it in JavaScript, then you need to make sure the task is not blocking the event loop. In order to play nicer with the event loop you inject something like setImmediate or process.nextTick after certain amount of time or iterations... otherwise you will starve other tasks in the loop, notably, I/O.
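A minimal sketch of that "yield every so many iterations" idea. `sumCooperatively` and its `chunkSize` parameter are made up for illustration, and the summation just stands in for real CPU-bound work. It uses setImmediate rather than process.nextTick, because nextTick callbacks run before the loop gets back to I/O, so a long chain of them can still starve it.

```javascript
// Sum the integers in [0, limit) in slices, yielding between slices.
// Hypothetical stand-in for any long-running pure-JavaScript loop.
function sumCooperatively(limit, chunkSize = 10000) {
  return new Promise((resolve) => {
    let total = 0;
    let i = 0;

    function runChunk() {
      // Do at most chunkSize iterations of work in this turn.
      const end = Math.min(i + chunkSize, limit);
      for (; i < end; i++) total += i;

      if (i < limit) {
        // Hand control back so pending I/O callbacks get a turn,
        // then continue with the next slice.
        setImmediate(runChunk);
      } else {
        resolve(total);
      }
    }

    runChunk();
  });
}
```

The total running time goes up a little because of the scheduling overhead, but other requests get served between slices instead of waiting for the whole computation.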
node is also not a really good idea if you need a lot of interprocess communication.
It is a very viable alternative, though.
https://github.com/spion/fpco-article-examples
It's an interesting benchmark, but it needs more work to give a more accurate picture. It would be nice if it:
* used wrk to measure requests and latency percentiles
* percentage of "slow" requests was tweakable
* number of workers per core was tweakable
Then we could generate a nice chart that shows latency percentiles as fn of % of slow requests and workers per core, and compare with Haskell.
giggles uncontrollably
I like Haskell, but saying that it doesn't impose a design burden is incredibly misleading.
With this statement, the author acknowledges that the Node.js code in the slow route was not asynchronous. The test is therefore invalid; it's comparing apples and oranges.
Node.js is more than capable of handling different requests asynchronously (regardless of whether they are fast or slow); if you have any kind of blocking or waiting around happening, then you're doing it wrong.
I'm so tired of all the anti-Node.js propaganda; it's hurting people. If I walk into one more company where some zombie tells me that they're migrating away from Node.js because "the Node.js event loop starves the CPU", I'm going to have a stroke.
In reality all Node.js 'starvation' problems can be solved with the 'cluster' module or the 'child_process' module.
Since we've been talking a lot about 'fake news' on Facebook recently, maybe we should start talking about how fake news is affecting Hacker News. This anti-Node.js strain is particularly virulent.
What might be more interesting:
- what is the salary difference between a Node dev vs. a Haskeller?
- is there a productivity difference? does the salary over- or under-compensate?
- is correctness a core business concern?
- if I need to hire 10 devs, can I do that?
Go watch the node.js presentation Ryan Dahl gave at JsConf 2009, he addresses this during that speech.
His opinion on the "right way" to do concurrency is a little polarizing, but, quote: "the right way to do concurrency is to use a single thread and have an event loop. this requires that what you 'do' outside of IO waits not take very long".