An open question (rant) about Node.js

(gist.github.com)

93 points · jobeirne · 12y ago · 116 comments

80 comments · 28 top-level

ef4 12y ago · 19 in thread

It's worse than that: it's true that in-process asynchronicity is a performance optimization, but it doesn't follow at all that the code needs to read asynchronously.

There is no performance reason to make developers hand-code in the continuation-passing style. That should be the job of a compiler or runtime environment.

Erlang does it. Stackless Python does it. Go does it.

The whole "look how fast Node is due to all my hand-rolled callbacks" is a total fallacy. It's a case of taking perverse pride in an accidental wart of your language.

Slowly the node community is maturing and beginning to realize that if they want to build robust, fault-tolerant software they need better primitives than hand-rolled callbacks. Hence the slowly rising mindshare of Promises, which are strictly better, but would be better still if the language itself baked them in implicitly (consider that nearly any value in a Haskell program is represented by a promise under the hood, but you never have to deal with them explicitly).
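
To make the contrast concrete, here is a minimal sketch of one two-step operation written first in hand-rolled continuation-passing style and then with promises and `async`/`await` (syntax that reached JavaScript after this thread). `fetchUser` and `fetchPosts` are hypothetical stand-ins simulated with timers, not real APIs; both versions are equally non-blocking.

```javascript
const { promisify } = require('util');

// Hypothetical async operations, simulated with timers.
function fetchUser(id, cb) {
  setTimeout(() => cb(null, { id, name: 'user' + id }), 5);
}
function fetchPosts(user, cb) {
  setTimeout(() => cb(null, [user.name + ':post1', user.name + ':post2']), 5);
}

// Continuation-passing style: control flow and errors are threaded by hand.
function postsForCps(id, cb) {
  fetchUser(id, (err, user) => {
    if (err) return cb(err);
    fetchPosts(user, (err2, posts) => {
      if (err2) return cb(err2);
      cb(null, posts);
    });
  });
}

// Same asynchronous work, but the code reads sequentially.
const fetchUserP = promisify(fetchUser);
const fetchPostsP = promisify(fetchPosts);

async function postsFor(id) {
  const user = await fetchUserP(id);
  return fetchPostsP(user);
}
```

In the second version a rejected promise propagates to the caller like an exception, so no per-step `if (err)` check is needed.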

jeswin 12y ago

> There is no performance reason to make developers hand-code in the continuation-passing style.

node certainly recognizes this. However, there is no need to do anything here since JS is taking care of this with ES6. Even if you don't want to use ES6, there are now multiple ways to compile it down to ES5.

> Hence the slowly rising mindshare of Promises

Promises are declining, and honestly don't really solve the problem in most cases. A couple of years from now, you won't see promises on the server. You should see tjholowaychuk/visionmedia's co[1]; that's how JS is going to look in the future.

In terms of expressiveness, there isn't much to choose between dynamic languages. While it took some getting used to, programming JS wasn't that different from writing Python. However, JS/node is succeeding because this is the first time ever that we have a language/platform that is truly write-once, run-anywhere. A significant advantage in favor of node.

[1] https://github.com/visionmedia/co

ef4 12y ago

Yes, I'm looking forward to ES6. And I agree that Node's strength is that Javascript is truly the lingua franca of the modern web, which is why I would even bother using it.

But the value of running everywhere is also why ES6 is not a panacea. As you say, soon we won't need promises "on the server". But I want my codebase to run correctly everywhere. Promises are a bandaid that I'll need for quite a while yet.

babby 12y ago

>A couple of years from now, you won't see promises on the server. You should see tjholowaychuk/visionmedia's co[1]

Don't you mean you won't see standalone Promises, and instead Promises + Generators?

angersock 12y ago

The linked page doesn't really convince me.

What do we get using this new shiny instead of the older shiny bluebird or the old shiny Q?

masklinn 12y ago

> Promises are declining, and honestly doesn't really solve the problem in most cases. A couple of years from now, you won't see promises on the server. You should see tjholowaychuk/visionmedia's co[1]; that's how JS is going to look like in future.

I'm impressed that you apparently completely failed to realise co is an abstraction layer over promises, something it states upfront:

> Generator based flow-control goodness for nodejs (and soon the browser), using thunks or promises

What you yield from your generator is not magical pixie dust, it's promises.
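
A rough sketch of the idea, assuming nothing about co's actual source: a small runner drives a generator, treats every yielded value as a promise, and resumes the generator when that promise settles. `run` and `delay` are hypothetical names made up for this illustration.

```javascript
// Minimal co-style runner (an illustration, not co's implementation).
function run(genFn) {
  return new Promise((resolve, reject) => {
    const gen = genFn();
    function step(method, arg) {
      let result;
      try {
        result = gen[method](arg); // resume the generator via next() or throw()
      } catch (err) {
        return reject(err);
      }
      if (result.done) return resolve(result.value);
      // Whatever was yielded is treated as a promise.
      Promise.resolve(result.value).then(
        (value) => step('next', value),
        (err) => step('throw', err)
      );
    }
    step('next', undefined);
  });
}

// Hypothetical async helper that returns a promise.
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

const total = run(function* () {
  const a = yield delay(5, 1); // yields a promise, resumes with its value
  const b = yield delay(5, 2);
  return a + b;
});
```

Here `total` is itself a promise, which is the point being made: the generator only changes how the code reads, while promises still carry the results.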

bascule 12y ago

First: reactor-based event loops aren't always a performance optimization. Round tripping I/O operations through an event loop adds latency. Node's approach is great if you have large numbers of mostly idle connections, such as in a chat or websocket server, but slower than a multithreaded, blocking approach when you have small numbers of active connections. What you really want is a system that supports both models, so you can choose the best one for your problem.

Some Node developers, for example the ones behind Meteor, have come to the conclusion that callback/promise-driven development is painful and tedious, and started wrapping up the asynchronous behavior in coroutines.

Unfortunately, to participate in this sort of world, where you wrap the asynchronous spaghetti into what the "Node.js Is Badass Rockstar Tech" video would call "sequential code, you know, the code you can read", each async library must be hand-plumbed into a synchronous version that does the coroutine juggling.

This puts Node in exactly the situation it was trying to escape: now there are two different I/O models, one synchronous, one asynchronous, and not all libraries support the synchronous model, so by trying to leverage it, you're cutting yourself out of large parts of the Node ecosystem.

Better environments abstract over the I/O scheduling, letting you choose between threaded, blocking I/O, and an M:N task scheduler which can handle many lightweight tasks being scheduled on native threads. Rust comes to mind.

rdtsc 12y ago

I wrote a post on this in a Go topic here not too long ago:

https://news.ycombinator.com/item?id=7388790

It was about how essentially, in a large concurrent system, a chain of callbacks like:

cb1 -> cb2 -> cb3|eb3 then cb3->cb4 and eb3->cb5

(select/epoll returns and calls cb1, chain ends with cb4 or cb5 depending if errback eb3 is called).

Such a chain is just a messy, dangerous and confusing re-implementation of threads/goroutines/tasks. Besides spreading the business logic among multiple I/O-related functions or sprinkling yields or thens, it doesn't completely save one from needing locks and semaphores if shared or non-local data is modified.

A second callback chain from cb1 could have started before the previous one finished. Now they both could be modifying the same data. Yes, the granularity here is at the level of code blocks between I/O points rather than individual assembly instructions, but as the system grows large this problem becomes apparent.

I had to deal with it in a Python Twisted-based framework. There is a Twisted Semaphore and I had to use it.

Node.js and reactor-based event loops look _very_ nice in small demos and when the callback chain is shallow. HAProxy and nginx are good examples of this: they have shallow callback chains. Node.js demo examples also look good: "Oh look, you can serve 'Hello World' in 5 lines on a websocket!", stuff like that.

As systems grow larger, callback chains as concurrency mechanisms start to suck.

ulisesrmzroche 12y ago

It's all async under the hood because, in the end, you can only really resolve one thing at a time. Just because generators and promises let you write a more synchronous style of programming, essentially sugar, it doesn't mean that it's actually working in a true sync fashion.

greatsuccess 12y ago

When people start describing how Node works rather than bloviating blind praise, the first thought I have is: Windows. Yeah, that's how Windows works too. I'm totally sold now!

cmbaus 12y ago

Using synchronous I/O is only simpler if you never have to synchronize threads, which you always do.

After taking a closer look at Discourse, it seemed to me that there was something amiss about how the Ruby community handles concurrency -- non-blocking servers (thin), with blocking database I/O (ActiveRecord), behind a round robin load balancer (nginx), combined with external processes for asynchronous tasks (sidekiq). That is a lot of tooling to handle concurrency. Comparatively I think Node has some advantages.

I am a proponent of Node, not because I think it is perfect, but I do think it does many things right. If you are building a significant web app you can't avoid JavaScript. And it is nice to be able to use and move code between the client and the server.

Also, this might sound weird, but I think there are some benefits to forcing developers to think differently about blocking vs non-blocking operations, since there are magnitudes of performance differences between the two. I think the outcome will be better architected and performing applications, but that's just a hunch.

Maybe Erlang and Go do concurrency right with message passing and light weight threading models, but they haven't taken the web development world by storm either. I think Go has a good chance, but time will tell.

rdtsc 12y ago

> Using synchronous I/O is only simpler if you never have to synchronize threads

No, but you have to synchronize callback chains, which is a lot worse in a large system. Otherwise it is kind of a tautology, well you don't have threads so you can't synchronize threads. But one main reason to synchronize threads is to prevent shared data from being corrupted from concurrent access. But now also the business logic is split based on IO points.

> Also, this might sound weird, but I think there are some benefits to forcing developers to think differently about blocking vs non-blocking operations,

No doubt it is important to be aware how these things work probably down to libc level. What makes node.js tick, v8 and libuv, what do those do, and so on.

> since there are magnitudes of performance differences between the two.

Completely disagree. There is no general orders-of-magnitude performance difference between the two across all applications. The differences are application-specific and architecture-specific.

> I think the outcome will be better architected and performing applications, but that's just a hunch.

Spreading business logic across callback boundaries or sprinkling yields or thens, is not always a good way to handle things. It is in some cases but as a general approach, I think it is pretty bad.

> Maybe Erlang and Go do concurrency right with message passing and light weight threading models, but they haven't taken the web development world by storm either

One reason is that people don't understand the underlying technology and its limitations, or are just not aware of other programming paradigms (especially for handling distributed and concurrent issues).

So I tend to not go along much with "well many are doing things this way so it must be better". Over the years I have come to believe that just following the crowd gets one an average result. Your stuff is just as broken, or works just as well, as anyone else's.

That is why it is important to look at Go, Erlang, Rust, Haskell, Prolog etc. They provide new ways of thinking that could help you accelerate faster than the rest of the crowd.

bascule 12y ago

> That is a lot of tooling to handle concurrency. Comparatively I think Node has some advantages.

Node forces everything into a single-threaded event loop. That's great if you're writing a chat server. It's not so great if you want to do something that actually uses the CPU.

Ruby has threads (which execute in parallel on multiple CPU cores with JRuby/Rubinius), async I/O, and ways to build hybrid systems out of them cleanly, like Celluloid.

ephemeralgomi 12y ago

Promises are great for replacing callbacks in the sense of "Here is a function I want you to call once and only once". You can't convert the callback parameter of http.Server.listen() to a promise.

I don't really get the confusion. First-class functions let you do all kinds of great things - I can do things like map across a list to generate a list of functions which I then pass to somewhere else that can left-fold them, which I have trouble visualizing how I would do in a pseudo-synchronous language.

I understand that people have strong preferences for how they like to write code and that's why it's great that there are tons of programming languages out there. Javascript is one of them.

nawitus 12y ago

According to Wikipedia, Go's concurrency-model is not safe, and that's a huge problem in my opinion:

"There are no restrictions on how goroutines access shared data, making race conditions possible. Specifically, unless a program explicitly synchronizes via channels or mutexes, writes from one goroutine might be partly, entirely, or not at all visible to another, often with no guarantees about ordering of writes.[36] Furthermore, Go's internal data structures like interface values, slice headers, and string headers are not immune to race conditions, so type and memory safety can be violated in multithreaded programs that modify shared instances of those types without synchronization."

rgbrgb 12y ago

You know you can write code with race conditions with nodejs too though, right? 2 callbacks called in the same closure can modify the same variable. I love node but after correcting many, many errors like this in intern code I question whether async callbacks are strictly simpler than multithreading.
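
A small sketch of that kind of bug, using a made-up `withdraw` function and a timer standing in for I/O. Each callback chain reads the shared variable, waits, then writes back a value computed from its stale read, so one update is lost even though only one thread ever runs.

```javascript
let balance = 100; // shared state visible to both callback chains

// Read-modify-write with a simulated I/O point in the middle.
function withdraw(amount, done) {
  const seen = balance;        // read
  setTimeout(() => {           // another chain may run before this fires
    balance = seen - amount;   // write based on a possibly stale read
    done();
  }, 5);
}

// Start two withdrawals back to back and report the final balance.
function demo(cb) {
  let pending = 2;
  const finish = () => { if (--pending === 0) cb(balance); };
  withdraw(30, finish);
  withdraw(50, finish);
}
```

Both withdrawals read 100 before either writes, so the final balance is not the expected 20. Serializing the two operations, or re-reading `balance` inside the timer callback, avoids the lost update.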

rdtsc 12y ago

Use Erlang or Elixir then. They have truly isolated process heaps. Processes are lightweight and preemptively scheduled. You can spawn 100Ks of them on a single machine, each with its own isolated heap that can crash and not affect the rest of the system. That is kind of amazing.

irahul 12y ago

Idiomatic Go concurrency uses channels for synchronization and passing data around. You shouldn't be writing to shared data from multiple goroutines.

bascule 12y ago

Sidebar: Rust solves this problem by tracking the lifetime of memory throughout the program, ensuring there's no unsafely shared mutable state.

EGreg 12y ago

For this, Node.js is supposed to run several processes. While one is tied up, another one works. This is almost as good as Erlang's actor model. You can have a small scheduler library figuring out who to send the next request to.

What I do think is that Node.js processes should be expected to crash at any time. Because a Node.js process serves many requests, it can run out of memory, or anything else. Node.js processes should be able to be restarted instantly, with the consequence of only a few dropped requests (in fact, the requests should be retried if they fail due to a crash, before reporting a failure to the client).

robgering 12y ago · 6 in thread

I understand that Node.js is really popular here and that my opinion may be an unpopular one. Maybe this will change, but I cannot find a use case for Node where I wouldn't choose something else.

If I want to prototype (performance is not a constraint), I'd use Ruby (or Python) before Node. If I need concurrency or if performance is a constraint, I'd choose Go (or Clojure, or Scala, or Erlang) before Node. I understand the argument about using JavaScript throughout the stack, but I see this as specious at best. Front-end and back-end development require different domain knowledge, not just language proficiency.

Perhaps using Node opens a project to a larger community of developers. But under this argument, using C or Java also seems a good choice.

ef4 12y ago

So I'm a node critic, and I think both the technology and the community suffer from technical immaturity.

But I actually use it and depend on it, for precisely the shared-codebase reason. And it helps immensely for my use case.

I actually started out with Ruby on server side, but spent increasing amounts of time duplicating the same code in both languages. Switching it all to Javascript was a demonstrable win.

When people argue it's a specious benefit, it just means they haven't tried to build a truly ambitious client-side application. If performance, consistency, and security matter you're going to have a lot of code that needs to run at the client for responsiveness and at the server for safety & security. And if you care about offline-capable applications, the problem becomes even more acute: your clients are now a fully distributed system with their own state, and it's easier to write all the synchronization and recovery logic once, and run it everywhere on both clients and servers.

robgering 12y ago

This is great feedback. Thank you.

bryanlarsen 12y ago

Here's an example: http://clara.io

Obviously this is a fairly heavy client application. But imports, exports and renders are server side operations. They manipulate the scene graph using the same code that the client side uses to manipulate the scene graph.

Live render is a great example. When you start a live render, we start a client on the server, but instead of translating the scene graph into WebGL, it translates the scene graph into V-Ray. Any changes you make to the scene are sent to the server-side client using the same mechanism used to synchronize changes between 2 users editing the same scene.

If we used a different language in the front end and the back end, we would have ended up with a lot of duplication of effort.

angersock 12y ago

Very cool!

I started working on a winged-edge WebGL editor--really happy to see other people getting this stuff to work!

robgering 12y ago

Interesting. Thanks!

yaur 12y ago

I haven't done more than fiddle around with Node and I'm not keen on the idea of JS on the server for basically the same reasons that you already stated, but if you are writing a SPA and you need to be able to generate the same views on the server for clients that don't run javascript (like googlebot) I believe that Node makes this relatively easy.

dap 12y ago · 5 in thread

There's no sane possibility of a hybrid for Node.js. The concurrency model in JavaScript is built for one thread [that runs JavaScript]. For servers, in such a model, everything has to be async. A 20ms synchronous operation means 20ms of time that you cannot handle new connections or process others that have sent or received data. It's critical to the success of the platform that almost everyone insist on async-only code. Recall that the asynchronous model has been done before by Twisted and EventMachine, but they have exactly this problem: there's tons of Python and Ruby code out there that's synchronous and if you accidentally use it (even by using a dependency of a dependency), the server falls over. In Node, it's very hard to accidentally use it, by design: you have to write an add-on or use one of the few *Sync functions, which are well-known as red flags in a server.

In terms of why this model is compelling: it scales well, and it's a relatively simple model of synchronization. The 5-line HTTP server on nodejs.org scales to a very large number of requests per second. Having spent a lot of time in the heavily multi-threaded C world (and loving it, btw), it's much harder to shoot yourself in the foot with JS and this model, and the failure modes are usually less severe. (Many C programs use this model as well, including haproxy and much of the kernel.)

It's not all win. Blocked operations are harder to debug. But you can work around things like that, and overall it's a nice environment. And FWIW, once you're familiar with it, the waterfall pattern quickly becomes as second-nature as writing synchronous code. (More complex ones are still tricky, but the basics do become second-nature.)

[edit: I meant that there's no sane possibility of a hybrid between synchronous and asynchronous semantics. That doesn't mean you couldn't use a language feature to make async code look synchronous. That's a much more subjective discussion, but I prefer explicitness over compiler magic.]
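
The 20ms point can be seen directly with a small measurement. This is only a sketch, and `timerLagWhileBlocking` is a made-up helper: it schedules a 0 ms timer, then busy-waits synchronously, and reports how late the timer actually fired.

```javascript
// Measure how late a 0 ms timer fires when synchronous work hogs the thread.
function timerLagWhileBlocking(blockMs) {
  return new Promise((resolve) => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled), 0);

    // Synchronous busy-wait: no other callback can run until this loop exits.
    const end = Date.now() + blockMs;
    while (Date.now() < end) { /* spin */ }
  });
}

timerLagWhileBlocking(20).then((lag) => {
  console.log(`timer fired ${lag} ms after it was scheduled`);
});
```

The timer cannot fire until the loop exits, so the reported lag is at least the blocking time. In a server, every pending connection would wait the same way.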

ef4 12y ago

This is precisely the fallacy I'm talking about in my top-level comment.

You're conflating asynchronous implementation (which I agree is good), with structuring the code in continuation-passing style (which is bad, for a lot of reasons).

It's all just plumbing, and people demonstrably do it badly if we judge by how well the typical node module deals with unexpected failures. The official node policy on exceptions is that you should let the process die. Not because it's impossible to write exception-safe code, but because almost nobody does it, because it's too hard to do when you're plumbing it all by hand.

rpedela 12y ago

Do you have a reference for the "official policy" statement? Any experienced systems developer, regardless of language, would never think that was a good thing.

In addition, it is not hard to handle the errors due to callbacks. The Node variant of Javascript is basically dynamically typed C and error handling actually follows a very similar pattern to what is standard practice in C. Fault-tolerant systems are hard, period. But since the most important fault-tolerant systems (cars, planes, etc.) are all written in C, I do not see why it is "too hard" to build a fault-tolerant server with Node, since error handling follows a similar pattern.

Just to be clear, I am not advocating the use of Javascript for automobile software.

dap 12y ago

> It's all just plumbing, and people demonstrably do it badly if we judge by how well the typical node module deals with unexpected failures. The official node policy on exceptions is that you should let the process die. Not because it's impossible to write exception-safe code, but because almost nobody does it, because it's too hard to do when you're plumbing it all by hand.

If it needs to be said, that's two (three?) enormous overgeneralizations and one wrong statement. (Can you point to the official policy on all exceptions allowing processes to die? I don't believe that's what the official documentation or best practices have ever advocated.) The vast majority of operational errors in Node.js are Errors but not exceptions, so your statement is also misleading in that way.

ulisesrmzroche 12y ago

Whoa, no. You pass errors as the first argument, but that's about as much convention as there is. There certainly isn't anything official or whatever on what to do with them, here or in any other framework.
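
For reference, the error-first convention mentioned here looks roughly like this. `readJson` is a hypothetical helper; `fs.readFile` and `JSON.parse` are the only real APIs used.

```javascript
const fs = require('fs');

// Error-first convention: the callback's first argument is an Error or null.
function readJson(path, cb) {
  fs.readFile(path, 'utf8', (err, text) => {
    if (err) return cb(err);      // operational error: passed along, not thrown
    let data;
    try {
      data = JSON.parse(text);    // JSON.parse throws synchronously on bad input
    } catch (parseErr) {
      return cb(parseErr);        // convert the exception into an error argument
    }
    cb(null, data);
  });
}
```

Only the parse step sits inside the `try`, so an exception thrown by the caller's own callback is not swallowed and misreported as a parse error.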

justincormack 12y ago

Indeed, look at OpenResty, which is async but written as if the code were blocking, using coroutines.

EGreg 12y ago · 4 in thread

Um, am I the only one who considers Node.js a beautiful tool suited for developing servers?

Why async by default? Because defaults should encourage best practices. It can handle the C10K problem out of the box. With libraries like Q, the code can actually be rather easy to follow and maintain.

One of the biggest reasons why a multi-user server should be coded in async-by-default style is as follows: often, you want to obtain objects and not care about how they are obtained. In procedural languages, this isn't possible. Consider the following challenges:

1) Query sent to 10 different shards, wait until they all return, combine the results in the app, and then use them. In a synchronous language, you HAVE TO go to one shard, then another, then another, and the time is 10x longer. In an async style, it's only as long as the longest query!

2) Some of the objects may have already been cached, whereas others need to be obtained. In addition, the ones that need to be obtained shouldn't be requested more than once at the same time, but callbacks should be placed on a waiting list.

3) Various parts of your code (e.g. from different requests) might want to request the same object. Not only that, but you might want to batch your requests. That is to say, wait 10 milliseconds and see if any other (unrelated) parts of your code also want to grab some objects from the same DB table. Suddenly you are issuing far fewer queries because you can build middleware like this easily in JS. Not so in PHP for example - I would know. A lot is wasted there.

4) Let's consider #1 some more. The queries may go to other HTTP endpoints - do you really want to wait that long for I/O? Furthermore, even if they are going to local MySQL shards, your site becomes slower with an extra factor of O(log n) with the number of users. Kind of like when you use a relational database instead of a graph database for one-degree-away lookups. It could have been O(1) but instead you introduced another factor of O(log n). Not terrible, but certainly gives you a 5-20x cost down the line.
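
Points 1 and 2 above can be sketched in a few lines. Everything here is an assumption for illustration: `queryShard` is a timer-based stand-in rather than a real database client, and `getObject` is a made-up name for the "waiting list" idea.

```javascript
// Hypothetical shard query, simulated with a timer; not a real database client.
function queryShard(shard, key) {
  return new Promise((resolve) =>
    setTimeout(() => resolve(`${key}@shard${shard}`), 10 + shard));
}

// Point 1: fan out to every shard at once and wait for all of them.
function queryAllShards(key, shardCount) {
  const pending = [];
  for (let shard = 0; shard < shardCount; shard++) {
    pending.push(queryShard(shard, key));
  }
  return Promise.all(pending); // resolves once the slowest shard has answered
}

// Point 2: callers asking for a key that is already being fetched
// share the in-flight request instead of issuing a second one.
const inFlight = new Map();
let fetchCount = 0;
function getObject(key) {
  if (!inFlight.has(key)) {
    fetchCount++;
    const request = queryShard(0, key).finally(() => inFlight.delete(key));
    inFlight.set(key, request);
  }
  return inFlight.get(key);
}
```

`Promise.all` keeps the results in shard order regardless of which shard answers first.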

bascule 12y ago

> In a synchronous language, you HAVE TO go to one shard, then another, then another, and the time is 10x longer.

Have you ever heard of threads?

dharbin 12y ago

Threads are more difficult than writing event-loop based code IMO

EGreg 12y ago

In PHP and other scripted web server languages? Please show me how it's done.

PHP does preforking first of all. You have about 30 "threads" sitting around, able to handle 30 clients. This is STILL not the same as evented programming, which lets you send out these requests and wait.

Speaking of -- have you ever heard of evented i/o?

Let me put it this way ... threads and workers are good for handling incoming requests (one worker per request). But for outgoing requests, it's nice to have evented i/o!

nickfargo 12y ago

Referencing C10k implies yes. Threads cost.

bitcrusher 12y ago · 3 in thread

I have a hard time 'listening' to these rants about how hard it is to reason about asynchronous code. It's NOT hard, it's DIFFERENT.

This isn't some new Node-Sauce. The idea of an event-loop asynchronous programming domain has been around forever in various forms, first in games and desktop GUI programming, and the way one reasons about it hasn't changed just because it has been applied to a new class of problems.

I suspect that there are two things going on here:

1. A LOT of people dislike Javascript. As a result, they're starting with that negative point of view and then finding reasons to support it.

This is not unlike people who argue about Python white-space. They're really saying "I don't like Python." The white-space issue is incidental.

2. Lack of experience / exposure

I suspect that folks who have trouble with this, frankly just aren't experienced enough. I'm not saying they aren't skilled, just not experienced in this realm.

Once upon a time, these same sorts of 'arguments' were used against Ruby (à la the Rails explosion). Nothing much has changed about Ruby, other than it has managed to be around long enough to influence a new generation of programmers.

jerf 12y ago

Yes, it is harder. You are attempting to reason about code in the absence of a useful stack trace. You're essentially taking your structured programming language and throwing it away and returning to the spaghetti-code era of code flow (this time with islands of structure). Both automated and human analysis is legitimately harder under these circumstances.

"2. Lack of experience / exposure"

Quite the contrary. I've had abundant experiences with both event-based models (which hardly started with Node!) in several paradigms (GUI, server, network) and models like Erlang, Go, and Haskell. It is no contest. The latter is sane. The former is not. If you have the opportunity to choose freely, and you choose the event-based model, you have chosen poorly.

Believe me, it is not the critics of Node who are the inexperienced ones. All the evidence I've seen points strongly in the opposite direction. Node advocates could start by getting their understanding of how their competition works out of the mid-1990s.

Ironically, in the end Node will eventually work their way around to working like these models. The signs are all there, the parallel evolution is clear, and apparently the process has already begun with Meteor. I wonder if the community will ever acknowledge how crazy they've been as they careen from one "solution" to the next at breakneck speed, all the while crowing about how much better their stuff is, even though what they consider "their stuff" can't seem to stay stable for 3 months?

bitcrusher 12y ago

"Quite the contrary. I've had abundant experiences with both event-based models (which hardly started with Node!) in several paradigms (GUI, server, network) and models like Erlang, Go, and Haskell. It is no contest. The latter is sane. The former is not. If you have the opportunity to choose freely, and you choose the event-based model, you have chosen poorly."

Nonsense. First, if you read what I said, I explicitly stated that this IS NOT new and has been around forever in many different forms, even citing some of the same domains you did ( GUI ).

Second, everything you state here can be summed up with "I don't like it." Your preference for the models in Erlang, Go and Haskell is an opinion, nothing more. I agree with you that there are some interesting approaches to different event-systems, across many different styles and languages, but there is nothing inherently superior about one over the other, beyond "taste".

"Believe me, it is not the critics of Node who are the inexperienced ones. All the evidence I've seen points strongly in the opposite direction. Node advocates could start by getting their understanding of how their competition works out of the mid-1990s."

I don't believe you, because I've been programming since the late 80s/early 90s, and I think Node is a fine choice for building certain types of software. I've been around the block more than once myself, through a ridiculously long list of programming "shifts".

The things you're espousing about Erlang, Go and Haskell's superiority were the same things that C/C++ programmers were saying about Java in '95.

"Ironically, in the end Node will eventually work their way around to working like these models. The signs are all there, the parallel evolution is clear, and apparently the process has already begun with Meteor. I wonder if the community will ever acknowledge how crazy they've been as they careen from one "solution" to the next at breakneck speed, all the while crowing about how much better their stuff is, even though what they consider "their stuff" can't seem to stay stable for 3 months?"

I don't know where this 'community' is, but frankly, most communities have advocates that talk too loud about things and make ridiculous claims about things. I believe it was Scott McNealy who said "I don't know what your question is, but the answer is Java". These things have nothing to do with actually writing code and creating good/usable software. I submit that you should ignore these people and focus on working with tools that help you write software.

If Node does not help enhance your ability to create software, it's not the tool for you. That doesn't mean that Node is inferior or stupid or useless, it just means it's not the right tool for YOU to use.

rdtsc 12y ago

> It's NOT hard, it's DIFFERENT.

Almost agree but not quite. From my experience it is _often_ a worse approach than using built-in concurrency units (Erlang actors, goroutines, even threads with queues).

Small demos or applications with very short callback chains benefit and look good with this type of concurrency.

You are right this is nothing new. HAproxy and nginx are very fast, very heavily used applications, they use an async event loop (select/epoll) type reactor. But that is a good fit for them because, their callback chains are very short and they don't do that much CPU heavy processing.

But as applications get larger and these callback chains start forking into errback and get deeper and deeper this is not a good paradigm for design.

Anyway just my two cents.

woola 12y ago · 3 in thread

Having written nodejs for about a year, I would never choose it again if given the choice.

* No error handling. The process has to be restarted if something goes wrong.

* No thread-local storage equivalent. As a result you can't do simple things like differentiating logs that originated from different HTTP requests.

* Callbacks make the code unreadable.

* JavaScript has no real methods; in other words, 'this' is not attached to a function and is determined by the way the function is called. So you have to wrap most of the callbacks with _.bind().

* Mixing callback, promise and event emitter APIs introduces a lot of boilerplate code.

* No stacktrace. Debugging errors is a PITA.

* No preemptive scheduling. So you have to be extra careful that you never spend too much CPU time.

* No easy way to handle back pressure.
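
The point above about 'this' can be seen in a few lines. This is a minimal sketch: `Counter` and `increment` are hypothetical names made up for illustration, and the standard `Function.prototype.bind` is used in place of `_.bind()`, which serves the same purpose.

```javascript
// Hypothetical class, used only to show how `this` is resolved at call time.
class Counter {
  constructor() {
    this.count = 0;
  }
  increment() {
    this.count += 1;
    return this.count;
  }
}

const counter = new Counter();

// Called as a method: `this` is `counter`.
counter.increment();

// Passed around as a bare function, the way a callback would be.
const detached = counter.increment;
let detachedFailed = false;
try {
  detached(); // class bodies are strict, so `this` is undefined here
} catch (err) {
  detachedFailed = err instanceof TypeError;
}

// bind() pins the receiver, which is what wrapping a callback in _.bind() does.
const bound = counter.increment.bind(counter);
const afterBound = bound();

console.log({ count: counter.count, detachedFailed, afterBound });
```

The detached call fails because nothing ties the function to its object; the bound version can be handed to any callback-taking API safely.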

MrBuddyCasino12y ago

These are exactly the reasons I have stayed away from it until now. I don't understand this one though, what does it mean?

* No easy way to handle back pressure.

nostrademons12y ago

I did some Googling. "back pressure" in the Node context is basically what most people refer to as "flow control" or "overload behavior". Node programs work by enqueueing a bunch of work items for the event loop to handle, and then going on and doing a bunch more stuff until all the work items have been handled and their callbacks have been called. In the process, those callbacks may themselves enqueue further work items.

The problem is that all these callbacks themselves take up memory and hold onto references in the V8 heap. And so if callbacks are being enqueued faster than they can be processed, the heap will get more full. V8's GC takes time proportional to the amount of live data, and so GCs will become more frequent and take longer. This further reduces the amount of available CPU time to process callbacks, which increases the amount of live data, until the system thrashes to a halt.

Haskell programmers (who use a similar scheduling mechanism for lazy evaluation) call this a "space leak", and it's a big problem in Haskell as well. The solution is for the runtime to let the calling code know that it's overloaded, and then either temporarily block the calling code from creating more callbacks until existing ones have run, or let the calling code gracefully degrade and choose to handle things in a simpler fashion.
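
The two remedies described above (make the caller wait, or let it degrade gracefully) can be sketched with a bounded queue. This is only an illustration, not how Node or V8 handle overload internally; `BoundedQueue`, `tryEnqueue`, `runNext` and the limit of 2 are hypothetical names and values.

```javascript
// Hypothetical bounded work queue: refuses new work once `limit` items are
// pending, so the caller learns about overload instead of filling the heap.
class BoundedQueue {
  constructor(limit) {
    this.limit = limit;
    this.pending = [];
  }

  // Returns true if the task was accepted, false if the queue is full.
  tryEnqueue(task) {
    if (this.pending.length >= this.limit) {
      return false; // back pressure signal: caller should wait or degrade
    }
    this.pending.push(task);
    return true;
  }

  // Runs the oldest pending task, freeing one slot.
  runNext() {
    const task = this.pending.shift();
    if (task) task();
  }
}

const queue = new BoundedQueue(2);
const handled = [];

// Offer three tasks to a queue that only holds two.
const accepted = [1, 2, 3].map((n) => queue.tryEnqueue(() => handled.push(n)));

// Once one task has run, there is room to retry the rejected one.
queue.runNext();
const retryAccepted = queue.tryEnqueue(() => handled.push(3));

console.log({ accepted, handled, retryAccepted });
```

The important part is the `false` return value: without it the producer has no way to know it should slow down.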

Detrus12y ago

Latest developments of Node Streams begin to address back pressure.
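
A minimal sketch of the stream mechanism, using Node's `stream.Writable`: `write()` returns `false` once the buffered data reaches `highWaterMark`, and the `'drain'` event signals that it is safe to write again. The slow sink, the tiny `highWaterMark` and the `writeAll` helper are made up for the example.

```javascript
const { Writable } = require('stream');

// A deliberately slow sink with a tiny buffer so back pressure shows up quickly.
const received = [];
const slowSink = new Writable({
  highWaterMark: 4, // bytes; unrealistically small, for illustration only
  write(chunk, encoding, callback) {
    received.push(chunk.toString());
    setImmediate(callback); // pretend each write takes a while
  },
});

let pauses = 0;

// Hypothetical helper: writes chunks until the stream says "stop",
// then resumes when 'drain' fires.
function writeAll(chunks, done) {
  let i = 0;
  function writeSome() {
    while (i < chunks.length) {
      const ok = slowSink.write(chunks[i++]);
      if (!ok && i < chunks.length) {
        // Buffer is full: wait for it to empty instead of queueing more.
        pauses += 1;
        slowSink.once('drain', writeSome);
        return;
      }
    }
    slowSink.end(done);
  }
  writeSome();
}

writeAll(['aa', 'bb', 'cc', 'dd'], () => {
  console.log('flushed', received, 'after pausing', pauses, 'time(s)');
});
```

`readable.pipe(writable)` does this bookkeeping automatically, which is usually the simpler choice when both ends are streams.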

sheetjs12y ago· 3 in thread

node.js does not impose it for everything. Every command in the fs library that has a natural synchronous interpretation (like reading a file, writing a file, ...) has a sync equivalent (readFileSync, writeFileSync, ...)

camus212y ago

Try to do that on a production server that has to handle 1000 requests readFileSyncing at the same time ...

Sync APIs are just a convenience for console applications; they are not meant for a web server.

So yes, node.js DOES impose async programming for everything, or node doesn't scale.

babby12y ago

Sync is great for initialization. Since you only start up once, you can be as reckless with performance-hindering practices as you want; it will never amount to more than a few hundred extra milliseconds of startup time.
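
A minimal sketch of that split, assuming a JSON config file: read it synchronously at startup, then use the callback form once requests are being served. The temp-file path, the `greeting` field and `handleRequest` are hypothetical; the file is created here only so the example is self-contained.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical config file, written here only to make the sketch runnable.
const configPath = path.join(os.tmpdir(), 'example-config-' + process.pid + '.json');
fs.writeFileSync(configPath, JSON.stringify({ greeting: 'hello' }));

// Startup: blocking is harmless because no requests are being served yet.
const config = JSON.parse(fs.readFileSync(configPath, 'utf8'));

// Request time: use the callback form so other requests keep being served
// while the disk read is in flight.
function handleRequest(respond) {
  fs.readFile(configPath, 'utf8', (err, text) => {
    if (err) return respond(err);
    respond(null, config.greeting + ' (' + text.length + ' bytes read)');
  });
}

handleRequest((err, body) => {
  if (err) throw err;
  console.log(body);
  fs.unlinkSync(configPath); // clean up the temporary file
});
```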

ef412y ago

True, and that helps for cases where you're willing to completely sacrifice concurrency.

But there's no reason we couldn't have syntactically synchronous code that still runs concurrently.

bitwize12y ago· 2 in thread

Here's a fun fact that every Amiga programmer seems to understand that Unix programmers just never really got: actual computing hardware is inherently asynchronous and interrupt-driven. The whole Amiga system is structured around using the CPU to program the custom chips and initiate DMA transfers, and installing interrupt handlers that perform the next phase of the program once the custom chips finished their work. It's how shit got done on the Amiga. By contrast, the lack of good support in early Unix for asynchronous I/O is said to be one of the reasons why X11 is such an ugly pile of hacks.

Even on Unix, if you wrote a server procedurally what you'd end up doing is having main() start a select loop that... dispatches incoming events to callbacks. All Node does really is to abstract away the select loop -- one less thing for the developer to worry about. The Node way is simpler than the conventional C way.

So no, there's nothing inherently superior about linear, procedural code, especially when such code is not at all how a computer system that must interact with the outside world works. Whether asynchronous, callback-driven code is confusing or not depends on what background you come from. Put an old-school scener in front of Node and he might form an entirely different opinion about it from OP.

nostrademons12y ago

I think the question the article is asking is "Why did Node stop at abstracting the event loop away, when it could've also abstracted the execution context?" There are languages that do this - Go, Erlang, Haskell, and Python 3.4 all provide a sequential process/coroutine abstraction on top of an implementation that's fundamentally asynchronous.

The question the author has isn't about implementation, it's about programming model. Programming languages & frameworks are ultimately made for humans, and most humans find it easier to reason about "Do this thing, then this thing, then these other things, until you're finished" than "I'm doing all these things at once!" We have computers to organize all the events, stack frames, wait queues, etc. that take us from the machine to the editor window.

bitwize12y ago

What I'm saying is that a programming model that's confusing to you is perfectly natural to someone else who comes from an environment where that model is all you have. People have used interrupt/event handlers and callbacks literally since the dawn of computing. It's nothing new. It just has to be learned, the way programming itself has to be learned.

pkinsky12y ago· 2 in thread

Asynchronicity is great, but javascript doesn't provide the tools to do it properly.

> (...) try writing a function call that requires information from two separate HTTP API responses; I basically need to draw a diagram of what happens with async.waterfall for a task that, given synchronicity, would've been solved with a trivial three-liner.

It's a lot easier to handle asynchronicity in strongly-typed languages with monadic for comprehensions. In Scala, for example:

    val res: Future[Foobar] = for {
      a <- makeHttpRequestA()
      b <- makeHttpRequestBFromA(a)
    } yield new Foobar(a, b)

You can then register callback functions which operate on either a Success[Foobar] or a Failure[Throwable].
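
For comparison, roughly the same shape in JavaScript with plain promises. This is a sketch only: `makeHttpRequestA` and `makeHttpRequestBFromA` are stubbed with already-resolved promises instead of real HTTP calls, and `Foobar` is a stand-in class.

```javascript
// Stand-ins for the two HTTP calls; real code would hit the network.
function makeHttpRequestA() {
  return Promise.resolve({ id: 7 });
}
function makeHttpRequestBFromA(a) {
  return Promise.resolve({ parentId: a.id, name: 'b' });
}

class Foobar {
  constructor(a, b) {
    this.a = a;
    this.b = b;
  }
}

// b depends on a, so the second request is chained off the first.
const res = makeHttpRequestA().then((a) =>
  makeHttpRequestBFromA(a).then((b) => new Foobar(a, b))
);

// Success and failure handlers, registered after the fact.
res.then(
  (foobar) => console.log('success, b.parentId =', foobar.b.parentId),
  (err) => console.error('failure', err)
);
```

The nesting inside `then` is what the for comprehension hides: `a` has to stay in scope for the step that builds the final value.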

Synchronous code isn't simpler, it just hides complexity like caltrops in tall grass.

(slightly modified version of my post on Github)

eldude12y ago

    co(function *(){
      let promiseA = makeHttpRequestA()
      let promiseB = makeHttpRequestBFromA(yield promiseA)
      var res = new Foobar(yield promiseA, yield promiseB);
    })();
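
A minimal sketch of what a runner in the style of `co` does with a generator like the one above. This is not co's actual source; `run` is a hypothetical helper, and the request functions are stubbed with resolved promises.

```javascript
// Hypothetical mini-runner: drives a generator, resuming it each time a
// yielded promise settles.
function run(generatorFn) {
  return new Promise((resolve, reject) => {
    const gen = generatorFn();

    function step(method, value) {
      let result;
      try {
        // Resume the generator with a value ('next') or an error ('throw').
        result = gen[method](value);
      } catch (err) {
        return reject(err);
      }
      if (result.done) return resolve(result.value);
      Promise.resolve(result.value).then(
        (v) => step('next', v),
        (e) => step('throw', e)
      );
    }

    step('next', undefined);
  });
}

// Stubs standing in for real HTTP calls.
const makeHttpRequestA = () => Promise.resolve(2);
const makeHttpRequestBFromA = (a) => Promise.resolve(a * 10);

const result = run(function* () {
  const a = yield makeHttpRequestA();
  const b = yield makeHttpRequestBFromA(a);
  return [a, b]; // reads top to bottom, but never blocks the event loop
});

result.then((pair) => console.log(pair));
```

The `async`/`await` syntax later standardized in ES2017 builds this same pattern into the language, so no runner is needed there.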

densh12y ago

with scala.async [1]:

   val res = async {
     val a = await(makeHttpRequestA())
     val b = await(makeHttpRequestBFromA(a))
     new Foobar(a, b)
   }

[1] https://github.com/scala/async

meryn12y ago· 2 in thread

I posted this as a comment on the gist:

It's not about performance, it's about <del>parallelism</del> concurrency.

If you don't care for Node's ability to handle multiple requests <del>in parallel</del> concurrently, with a single process with only one thread, then you might not have a need for Node at all. Why not use Ruby or Python?

If you really want to force Node into being a dumb scripting engine, you could use the various sync functions that Node.js provides. Problem is, they block the entire thread. But if you want to, you can. And I believe with some clever hacks, you can use the blocking nature of (for example) the fs sync functions to make other calls (including http requests) synchronous as well. But unless you use Node to build something that's not a server (people build command line tools with it as well), I don't think you want to.

By using synchronous functions, everything Node is built for will go down the drain. Everything done will be done in sequence, and during the time the sequence of operations has not finished, the Node runtime will be unavailable to serve any requests.

I suggest you look into promises as a somewhat saner way to deal with asynchronicity than callbacks, although you can get pretty far with just the Async library. The future might be in generators, a new feature of ECMAScript 6. This is available in V8. You can use this in Node 0.11 (unstable) if you use the --harmony flag. See here for a nice writeup: http://blogs.atlassian.com/2013/11/harmony-generators-and-pr...

bascule12y ago

You're confusing concurrency and parallelism. Node is a single-threaded event loop. It can only do one thing at a time. Node can perform many operations concurrently, but they're not running in parallel. If any one operation blocks, the whole event loop stalls.

Parallel execution requires multiple threads running on multiple CPU cores. Node requires multiple VMs to do this.
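
The stall is easy to observe. In this minimal sketch a timer is requested with a delay of 0, but it cannot fire until the synchronous busy loop returns control to the event loop. The 50 ms figure and the `order` array are arbitrary choices for the demo.

```javascript
const order = [];

// Ask for a callback "as soon as possible".
setTimeout(() => order.push('timer fired'), 0);
order.push('timer scheduled');

// Simulate CPU-heavy or blocking work on the only thread.
const start = Date.now();
while (Date.now() - start < 50) {
  // busy-wait for roughly 50 ms
}
order.push('blocking work done');

// Print the observed order a little later, once the first timer has run.
setTimeout(() => console.log(order), 10);
```

The timer callback always lands after 'blocking work done', no matter how short its delay, because only one piece of JavaScript runs at a time.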

meryn12y ago

Thanks for explaining the difference between concurrency and parallelism so clearly! I'll remember this.

I of course realize that Node doesn't truly handle things in parallel. It's a single thread, after all.

erjiang12y ago· 1 in thread

Please stop asking questions like this about Node.js.

The superfluous obtuse complexity that node.js forces is what's keeping ninja programmers like us employed. If we don't keep switching to strange new things every few years, our wages will stagnate as there's less separating our rock-star tech from lower-salaried uncool "enterprise" software.

Being able to manually manage continuations rather than using a language that does it for us is a high barrier that keeps us in demand, the same way that having to manage memory manually keeps people from becoming C programmers. Losing this edge is not something we should encourage.

Just imagine - if all of these "learn to code" programs actually work, the market will be flooded with new programmers. Thankfully, if we make sure that they only learn easy things like Python, we can continue to convince the world that things like Node.js and MongoDB are the future. While the rest of the world is still catching up on basics like Python and Postgres, we'll be one step ahead of them.

People questioning Node.js like this more and more are a sign that we need to find the next hotness and move there if/when node.js sinks. If we can convince people to build more things in the next hotness, then we can get in early before the masses pick up on it, ensuring that we can maintain our spendy lifestyles and free lunches in SF.

(/s)

ebiester12y ago

Node is passing the Peak of Inflated Expectations and heading into the Trough of Disillusionment. Rails is firmly in the Plateau of Productivity.

There are some genuinely interesting ideas coming out of node, and I think it's unfair to just call it obtuse complexity. I think that node.js has been a call to return to small, focused modules that will percolate back up.

An attempt to keep things small and focused is nothing new -- Unix started the trend in the 70s. Everything old is new again. I think they're forging some new patterns that take advantage of small, composable blocks through NPM in a way that CPAN and the like were never quite able to exploit.

I think node has an exciting opportunity, coupled with single page apps and browserify, to be a common front end to disparate web services. And, to me, that's exciting.

neya12y ago· 1 in thread

I was reading somewhere (maybe Hacker News itself?) that synchronous programming and/or blocking are actually features and not disadvantages. After working on a huge Node.js project that I regret taking on (I had to do it for the moolah, aka money), I am now convinced that migrating from Rails to Express (a Node.js framework) was a total mistake. Please note, I do not mean to advertise Rails in particular; in fact, you can replace Rails with anything else that fits the analogy, perhaps a Python framework.

However, this mistake has taught me something(s) very valuable -

1) Don't trade in a tried and tested framework for something like Node.js, that will open up a different level of problems altogether.

2) No matter what language you use, you WILL face issues at scale. Be it Node.js/Scalatra/Rails. You might as well face it with a tried and tested framework.

3) Don't re-write your own framework in something like Go or Haskell or C or C++ or etc. thinking it will be scalable, etc.

You will -

a) Waste massive amounts of time re-inventing the wheel,

b) Open yourself to possibly dangerous security vulnerabilities (Eg: Handling encrypted cookies),

c) Waste your time on things that don't really matter instead of on something that actually matters - The product itself.

4) Anything that you can scale by throwing more money at it is for the win[1]

Why am I talking about scalability, when the discussion is about async vs. sync programming? Because scalability is the reason many people cite for using Node.js and the like. It's just not worth it. Focus on the product first and the architecture later. This comes from someone who wasted a year of his life doing everything above.

[1]http://highscalability.com/blog/2013/4/15/scaling-pinterest-...

mclenithan12y ago

Sorry you had a bad experience, bud. I am on a team of six, all pretty experienced with Node, and we don't have those issues. We've been going production on it for about a year and a half now. People cried a little in the beginning because it was different but now that we have embraced (and love) callbacks, we rip through development pretty quick. Things aren't always done for you but it's getting pretty rare at this point.

clux12y ago· 1 in thread

Asynchronicity isn't an optimization, it's a simplification. Blocking code and threads are hard to reason about, but callbacks are simple. You still have to deal with errors in a synchronous world, and an evented system allows you to deal with branching event possibilities sanely.

ef412y ago

Yes, pre-emptive blocking threads are hard to reason about.

But cooperative blocking threads are just as easy to reason about as node's callbacks, while being significantly easier to structure and make fault-tolerant.

pessimizer12y ago

I can't speak to node (don't use it), but in Erlang, I find async far easier to reason about than sync for the simple reason that I don't have to plan for what will happen if some operation over the network breaks during a request.

In my head, there's no difference between async and sync. A synchronous call is just an async message + an async acknowledgement of receipt. If I write it that way, there are a lot of failure modes that I don't have to think about.

Also, in my head, concurrent code looks like a sparse group of people spread over a large, grassy, hilly plot of land shouting at each other from differing heights and many yards distance over the whine of shifting winds. I find it easier to reason about the shouts alone, instead of shouts + their responses, because a response is nothing but another shout.

sync12y ago

Async code is super easy to reason about when using ES6 Generators, which are available now in node v0.11.

Here's both a series and a waterfall example: https://github.com/visionmedia/co#example

christiansmith12y ago

The gist reads like someone looking for an excuse to give up and move on. Every time I run across someone bitching about Node's programming model I'm reminded of "The Blub Paradox".

http://paulgraham.com/avg.html

I have to admit that learning to think in callbacks was initially an exercise in frustration. But it wasn't that hard.

Most of the pyramid-type issues can be resolved by applying principles that are also valid in synchronous environments (like modularity).

If modules and named functions aren't enough, you can subclass EventEmitter to further decouple. For more complex cases there's https://github.com/caolan/async, which makes many nontrivial async problems relatively easy.
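
A minimal sketch of the named-function and EventEmitter approach, assuming Node's built-in `events` module. `LineCounter` and the handler names are hypothetical, and the work here is synchronous just to keep the example short; the same wiring applies when events arrive later.

```javascript
const { EventEmitter } = require('events');

// Hypothetical job that reports progress through events instead of
// taking nested callbacks.
class LineCounter extends EventEmitter {
  count(text) {
    const lines = text.split('\n');
    lines.forEach((line, index) => this.emit('line', line, index));
    this.emit('done', lines.length);
    return this;
  }
}

// Named handlers keep each step flat and individually testable.
const seen = [];
let total = null;

function recordLine(line, index) {
  seen.push(index + ': ' + line);
}

function recordTotal(lineCount) {
  total = lineCount;
}

const lineCounter = new LineCounter();
lineCounter.on('line', recordLine);
lineCounter.on('done', recordTotal);
lineCounter.count('first\nsecond\nthird');

console.log(seen, total);
```

Nothing is nested more than one level deep, and the producer does not need to know who is listening.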

Why is it so hard for so many people to appreciate Node for its merits? The code isn't going to execute in a perfectly linear way, so why should it have to be written as if it were?

angersock12y ago

So, here's how I think about it:

Any program, of any type, has a synchronous story to it: data came in, then this happened to it, then this happened to it, then it left. If you follow the program execution, this is how it works.

It's very easy to reason about, right? And it's very easy to program this way.

One of the biggest breakthroughs in computer engineering was the idea of multiple concurrent processes (and never mind that behind the scenes they were still running on a single processor!), and then later still true concurrent multi-processor architectures.

We developed operating systems and compilers and runtimes specifically to let us take advantage of that without having to break from easy-to-reason-about synchronous code.

Node said, hell with it, everything is async all the time. This is really annoying at first, because there are many many times where sync is the best way to reason about something.

That said, there is some beauty in such a hard-line approach: if you are truly async, handling an error is the same as handling a task which takes too long. Once you've paid that price properly, you're fine. (This is also what makes Erlang kind of cool.)

The big frustration is that there really ought to be a way of writing synchronous-looking Javascript code (ES5) which--behind the scenes!--is multiplexed and made to run concurrently and without blocking. The mechanisms are all there for programmers to deal with exceptional failures (exceptions, anyone?), and so are synchronization points (functions having multiple arguments should probably join on the evaluation of those arguments before being invoked--that's something the interpreter should be doing already).

I guess what I'm saying is...why don't JS interpreters do lazy evaluation and late binding?

jwarkentin12y ago

It seems that OP thinks the asynchronous nature of JavaScript was meant as a performance optimization. Let's go back to its origins. It was originally a scripting language for the browser. If it were to be synchronous and/or allow blocking things, such as `sleep()` or whatever else, it would have created an unusable web experience. Web pages would be constantly locking up. It's only recently that it's moved to the server where the asynchronicity on I/O isn't so mandatory.

I personally still think it's a good thing, but that could definitely be debated on the server side. Unfortunately, if you want to have the advantage of writing one code base that can run on the browser or server, then it must work the same in both places.

sergiotapia12y ago

https://www.youtube.com/watch?v=bzkRVzciAZg

morganherlocker12y ago

> If asynchronous code is harder to reason about, why would we elect to live in a world where it is the default?

Because if you do not make it the default, then most libraries you use will make the choice for you by going sync.

mattgreenrocks12y ago

My big beef with Node.js: it could have avoided the sync/async split entirely by using a synchronous facade atop an asynchronous runtime...but it didn't. Why would you create an entire server platform and not use the chance to simplify the most common use case: fetching data from a database?

I see it as a great opportunity completely squandered so that it would cater to the masses: "you don't have to learn anything new!"

Sometimes I think web tech likes being as shoddy and thrown-together as possible.

Cless12y ago

You can have an async implementation and synchronous code. http://meteor.com does that part quite nicely. The compiler or runtime should be handling this kind of cruft for us in most cases. Just because using Brainfuck might result in theoretically faster code than Ruby doesn't mean we should be writing our server in Brainfuck.

mistercow12y ago

>but I'd be very curious to hear a defense of "async-first" thinking for problems that are typically solved on the server-side

Because if your web server blocks every time it accesses a file, or talks to a database, or communicates over the network, then it won't take many requests at the same time before it's completely bogged down.

WhiteNoiz312y ago

My argument would be that if you go 'sync first, async if you feel like it', everyone will just do sync, because it is easier conceptually (or at least more familiar to a lot of people).

anuraj12y ago

In browser content may need to be asynchronous, but HTTP is request/response - nothing asynchronous. Half the web (or more) is not about feeding the browser. Use asynchronicity where needed.

deividy12y ago

Am I the only dev in the world that loves async?

schrijver12y ago

As a designer I have been trying to dig this stuff. Server side JS seems like a huge win for all kinds of small websites and interesting ui experiments, because one can stay in one language and avoid code duplication.

Yet in trying to set up some experiments that use a server-side git integration, I bumped into the kind of code described in this thread, and I find it super difficult to understand the async way of doing things.

Here’s an example from the homepage of nodegit, which seems to be among the best-maintained libraries that wrap libgit2. Thus I imagine it represents idiomatic node.js. What the code does is console.log details of all the repository’s commits, git log style.

  // Load in the module.
  var git = require('nodegit'),
    async = require('async');

  // Open the repository in the current directory.
  git.repo('.git', function(error, repository) {
    if (error) throw error;

    // Use the master branch (a branch is the HEAD commit)
    repository.branch('master', function(error, branch) {
      if (error) throw error;

      // History returns an event, and begins walking the history
      var history = branch.history();

      // History emits 'commit' event for each commit in the branch's history
      history.on('commit', function(error, commit) {
        // Print out `git log` emulation.
          async.series([
              function(callback) {
                  commit.sha(callback);
              },
              function(callback) {
                  commit.date(callback);
              },
              function(callback) {
                  commit.author(function(error, author) {
                      author.name(callback);
                  });
              },
              function(callback) {
                  commit.author(function(error, author) {
                      author.email(callback);
                  });
              },
              function(callback) {
                  commit.message(callback);
              }
          ], function printCommit(error, results) {
              if (error) throw error;
              console.log('SHA ' + results[0]);
              console.log(results[1] * 1000);
              console.log(results[2] + ' <' + results[3] + '>');
              console.log(results[4]);
          });
      });
    });
  });

It looks so alien to me! Instead of the tree-walking and looping constructs I know from synchronous languages, we deal with a history that returns events as it walks the tree (btw, why is it that we can launch the history before we attach the event?). Then we have to actually import the ‘async’ module, because, uhm, we want to specify an order by which the different parts of the message get logged.

How then the actual callbacks work, I’d be happy if someone can explain that to me—why do I pass a function to an attribute of the object? Why don‘t I just return the value?

Because the code is not written in a linear way, I have to keep looking back and forth between callbacks to get a mental picture of what it does.
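
A minimal sketch of the mechanics behind both questions, using only Node's built-in `events` module. It does not use nodegit; `history` and `sha` are hypothetical stand-ins that imitate the shape of the calls above.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for branch.history(): the walk is requested right
// away, but events are only emitted on a later turn of the event loop.
function history(commits) {
  const emitter = new EventEmitter();
  setImmediate(() => {
    commits.forEach((c) => emitter.emit('commit', null, c));
    emitter.emit('end');
  });
  return emitter;
}

// Hypothetical stand-in for commit.sha(callback): the value is not ready
// when the function returns, so it is handed to the callback later.
function sha(commit, callback) {
  setImmediate(() => callback(null, commit.id));
}

const log = [];
const walker = history([{ id: 'abc123' }, { id: 'def456' }]);

// Attaching after the walk has "started" is fine: nothing is emitted until
// this synchronous code has finished running.
walker.on('commit', (error, commit) => {
  sha(commit, (err, id) => log.push('SHA ' + id));
});
walker.on('end', () => setImmediate(() => console.log(log)));

log.push('listeners attached');
```

A plain `return` cannot deliver a value that does not exist yet, which is why the function to call later is passed in instead.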

For the record, what this translates to as synchronous Python:

  from pygit2 import Repository, GIT_SORT_TOPOLOGICAL

  repo = Repository('.git')

  # Walk the history from HEAD and print a `git log`-style entry per commit.
  for commit in repo.walk(repo.head.target, GIT_SORT_TOPOLOGICAL):
      print('SHA %s' % commit.hex)
      print(commit.commit_time * 1000)
      print('%s <%s>' % (commit.author.name, commit.author.email))
      print(commit.message)

I imagine that, as some of the commenters note, I will be able to start reading the async style if I get enough exposure to it. I wonder though, if it does not make the barrier to entry for server-side JavaScript too high for non-full-time programmers.

In that sense it is interesting that the Meteor team chose to use the package Fibers, that allows one to write node.js in a procedural style. I imagine something like that is necessary for server side JS to become really mainstream in web design…

notastartup12y ago

The creator of Node.js said it best himself: he's afraid people associate Node.js with scalability out of the box due to its single-threaded asynchronous nature, as if there were something magical about it that would suddenly make your app scalable because it's riddled with async code.

I now use Flask and Python instead of Node.js and it's a total pain reliever. Not having to deal with Javascript is a win on the server side (JS on client side is enough) and the python stack is good enough to handle every bit of Node.js

Truth is, Node.js is in a bubble. I find that even people who have never programmed before insist that their new website be written in Node.js... It reminds me of the dot-com bubble, when your grandma would explain to you her rationale for buying dot-com stocks (they had no idea).
