This article proposes a "nursery", which is just a wrapped sync.WaitGroup/pthread_join/futures::future::join_all/a reactor that waits for all tasks to terminate/etc.
It then uses an exception-like model for error propagation to "solve" error handling (which is fairly easy to handle with a channel).
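To make the "wrapped join plus error propagation" reading concrete, here is a minimal sketch using plain threads. The `Nursery` class is hypothetical and is not Trio's implementation; only the `start_soon` name is borrowed from the article, and the error handling (re-raise the first captured exception) is an assumption chosen for brevity.

```python
import threading


class Nursery:
    """Hypothetical helper: joins every thread it started, then re-raises the first error."""

    def __init__(self):
        self._threads = []
        self._errors = []

    def start_soon(self, fn, *args):
        def runner():
            try:
                fn(*args)
            except Exception as exc:
                # Capture the error instead of losing it inside the thread.
                self._errors.append(exc)

        t = threading.Thread(target=runner)
        self._threads.append(t)
        t.start()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        for t in self._threads:
            t.join()  # the WaitGroup / join_all part
        if exc_type is None and self._errors:
            raise self._errors[0]  # the exception-like propagation part
        return False


results = []


def work(n):
    if n == 2:
        raise ValueError("task 2 failed")
    results.append(n)


caught = None
try:
    with Nursery() as nursery:
        for n in range(4):
            nursery.start_soon(work, n)
except ValueError as exc:
    caught = exc
```

The block only exits once all four threads have been joined, and the failure from task 2 surfaces at the `with` statement rather than inside the thread.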
The construct is a decently usable, already applied tool to handle a set of problems, but the article blows the issue way out of proportion and overhypes the proposed solution. The "with" example for benefits to not having a "go" statement seems rather bogus, especially seeing that such RAII constructs do not exist in Go (no destructors, remember?).
Trying to claim that "go" is as terrible as the original "goto" is ignorance of the original problems. Bad use of goto can be a nightmare to track (as the author tried to illustrate), but goroutines do not jump around, they branch from the main goroutine, following normal control flow from there. They are easy to follow, and the language is designed so that you can throw around with them and forget them without them causing you problems.
Also, this article is comparing a list of concurrency constructs and one parallelism construct (pthread_create; threading.Thread doesn't count as parallelism due to the GIL) to callbacks, something which has nothing to do with concurrency at all. Very odd.
Now, while Go is designed mostly for you to not care about goroutines, there are some corner cases where one must know if a resource is used by anything, such as when you wish to close a file handle.
However, I'd argue that this is not related to Go's concurrency model and the presence of background tasks at all. It's related to object lifetimes. This should be made clear at API surfaces.
A language solution to this would be Rust's lifetimes, not a new concurrency model.
Unless they are documented informally or encoded formally.
I think the author proposes a useful way to structure multi-threading. The comparison to goto isn't perfect and he needs to play fast and loose with some terms to keep the analogy working but he makes a good point.
I'm not convinced yet that the nursery pattern should be the only allowed way to start a thread but saying "you should use it unless you have a good reason not to" is a good provocation to get the discussion going.
"I disagree; I don't think x is like goto."
"Now that we've agreed that x is like goto, not wanting to give up x makes you like a big dinosauric dummy~"
However, I find "go" and "goto" to not intersect at all. I've written this many times in other comments on this thread, so I'd rather not type it out again, but the TL;DR: is that "goto" can make understanding a function when read difficult, while "go" is clear when read. No function is understood if never read, and spawning "background tasks" is a core part of asynchronous programming (and thus not an unusual side-effect).
The "nursery pattern" is a decent construct that I have used quite often whenever I felt a need, but it doesn't appear to really solve any issues mentioned in the post. I also elaborated on this quite a few times already, so TL;DR: the only real problem of goroutines is things like references to potentially closed objects, but any method may end up storing an internal reference used at a later call, making the issue not related to concurrency.
You've greatly underestimated how general this problem is.
First, Python's `with` statement has nothing to do with destructors. From the PEP for `with`:
    with VAR = EXPR:
        BLOCK
which roughly translates into this:
    VAR = EXPR
    VAR.__enter__()
    try:
        BLOCK
    finally:
        VAR.__exit__()
`__exit__` is a method, not a destructor. As a result, Go could easily gain a `with`-like construct.
Moreover, the issue applies even if there's nothing like `with` in the language. If I'm reading a function definition, and that definition uses the "acquire, try, finally, release" pattern that is the desugaring of `with`, then it sure would be nice to know that nothing from BLOCK is running after the `finally` statement has run. `go` breaks that assumption.
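The broken assumption can be reproduced with a plain thread standing in for a goroutine. This is a sketch under stated assumptions: `late_reader` and `block_done` are illustrative names, and an in-memory `io.StringIO` stands in for a real file so the example has no side effects.

```python
import io
import threading

errors = []
block_done = threading.Event()


def late_reader(f):
    # Wait until the with block has already exited, then touch the resource.
    block_done.wait()
    try:
        f.read()
    except ValueError as exc:  # raised because the file object is closed
        errors.append(exc)


with io.StringIO("some data") as f:
    t = threading.Thread(target=late_reader, args=(f,))
    t.start()  # work from inside BLOCK that is still alive after the block ends

# f has been closed by __exit__, but the thread started in the block has not finished.
block_done.set()
t.join()
```

Nothing at the `with` statement itself reveals that `late_reader` outlives the block, which is the local-reasoning problem the comment describes.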
Go does not have exceptions (panic is not meant as "normal" flow control), and therefore has no use for a "with" construct. "defer" is used for a somewhat similar purpose. Go does not have destructors, and therefore has no possible implementation of RAII.
However, none of this applies to goroutines. A goroutine only gives errors if the author decides that such is necessary. If so, it will likely be through an error channel. There are no unexpected code paths through such a readout, rendering "with" and RAII useless.
So again, for a post that was very focused on complaining directly about the "go" keyword, I was expecting something applicable to Go.
Goroutines share more in common with method calls (which are a form of branching) and even with single-threaded scenarios, methods can do surprising things - especially if you have shared global state.
Additionally, if you control how data is shared (such as message passing - channels) you shouldn't have to be concerned about what the other thread is doing - so long as it reacts to messages that you send to it.
Method calls aren't really branching. The code path is still totally linear. You could basically copy paste the code in the method definition in place of its call and get the same outcome (not literally but you know what I mean). For goroutines this is not the case.
I think your points about state ownership are right on, but that's kind of the author's whole point. Right now, there are a lot of things you have to be very conscious of to write good code that executes cleanly using goroutines/threads/etc. That is very much the same as writing good code with gotos. It's 100%, unequivocally possible to write good code using gotos (every control flow structure Dijkstra proposed can be represented with them), it just requires a lot of added thought, and the potential for mistakes is much higher.
The author is not proposing any functionality that doesn't already exist, just a new control pattern to reduce the chance of creating problems.
I don't think that comparison makes sense. Channels and select are not a property of goroutines themselves, nor is using those the only way to communicate with other goroutines. The actual functionality of a goroutine is very similar to a thread, even if the broader language/conventions push it towards a coroutine/actor.
> Goroutines share more in common with method calls (which are a form of branching) and even with single-threaded scenarios, methods can do surprising things - especially if you have shared global state.
But they can't run in parallel, or data race. Goroutines, like classical threads, can have data races.
> Additionally, if you control how data is shared (such as message passing - channels) you shouldn't have to be concerned about what the other thread is doing - so long as it reacts to messages that you send to it.
This is also true of threads. Message passing is a layer on top of, and agnostic to, some concurrency mechanism. Go might make it slightly syntactically simpler than languages that use true OS threads (much simpler than C's pthread_create, but not particularly different to Rust's thread::spawn(some_closure)), but that's a syntax layer.
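As an illustration of message passing layered on ordinary OS threads, here is a small sketch using the standard library's `queue.Queue` as the channel. The `doubler` worker, the `inbox`/`outbox` names, and the `STOP` sentinel are all made up for the example.

```python
import queue
import threading

inbox = queue.Queue()
outbox = queue.Queue()
STOP = object()  # sentinel telling the worker to exit


def doubler():
    # The worker only reacts to messages; it shares no other state with the sender.
    while True:
        msg = inbox.get()
        if msg is STOP:
            break
        outbox.put(msg * 2)


t = threading.Thread(target=doubler)
t.start()
for n in (1, 2, 3):
    inbox.put(n)
inbox.put(STOP)
t.join()

replies = [outbox.get() for _ in range(3)]
```

The thread is the concurrency mechanism and the queues are the messaging layer on top of it; swapping threads for another mechanism would leave the messaging code largely unchanged.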
> (pthread_create; threading.Thread doesn't count as parallelism due to the GIL)
Eh? Why would you have a GIL? Maybe in Go, but so what, just don't use such a language/run-time.
What.
...and: [whatever] is just a [whatever]
...comes across as a bit of an asshole? Why not just make your point instead of investing a lot in making sure everyone knows that you're smart and the other person is dumb?
Maybe Knuth was just as rude to Dijkstra, who knows?
I felt that such "aggressive" countering was justified considering the equivalently grand claims of the post ("EXTREMELY_POPULAR_CONSTRUCT considered harmful").
If none of your code ever actually waits on the error channel when you spawn a goroutine, what happens?
Is there any circumstance where that behavior is preferable to guaranteeing something must wait on the error channel?
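A rough Python analogy for the unread error channel, assuming `queue.Queue` as a stand-in for a buffered Go channel (in Go, a send on an unbuffered channel with no receiver would instead block the goroutine). The `worker` function and the error message are invented for illustration.

```python
import queue
import threading

error_channel = queue.Queue()  # stand-in for a buffered error channel


def worker():
    try:
        raise RuntimeError("disk full")
    except RuntimeError as exc:
        # The failure is reported, but it is only seen if someone reads the queue.
        error_channel.put(exc)


t = threading.Thread(target=worker)
t.start()
t.join()

# Nothing forced the caller to look, so the error is still sitting in the queue.
unread = error_channel.qsize()
```

The program carries on as if nothing failed, which is the behavior the question is probing.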
The discussion about error paradigms (exceptions vs. globals vs. error values, forced checking vs. free choice, etc.) is quite a big one, and arguably an entirely different topic.
> The "with" example for benefits to not having a "go" statement seems rather bogus, especially seeing that such RAII constructs do not exist in Go (no destructors, remember?).
I don't really understand what you mean by this. You would certainly need a different mechanism in Go than in Python, but I fail to see how that is a criticism of the argument he is making. He has built one implementation in one language, clearly it would look different in other languages.
> Trying to claim that "go" is as terrible as the original "goto" is ignorance of the original problems. Bad use of goto can be a nightmare to track (as the author tried to illustrate), but goroutines do not jump around, they branch from the main goroutine, following normal control flow from there. They are easy to follow, and the language is designed so that you can throw around with them and forget them without them causing you problems.
Why exactly is branching better? With "go" statements, you can easily end up with goroutines that you aren't even aware of floating around doing things that you aren't aware of. His example of calls in external libraries is probably the best illustration. Currently, you could call what you think is a simple function from a library, and end up with a whole host of goroutines you didn't expect floating around, doing things and using resources. Articles about writing good libraries [0] have to make points about how you need to take care of this stuff.
Gotos create problems when you end up surprised about what code is being executed, making tracing execution paths difficult. Branching creates a different set of problems: most crucially that you can have code being executed and not even realize it's happening. The issues are different, but his analogy is really spot on IMO. Both of these things can be solved with code reading and debugging, but the author's whole point is that with an improved abstraction you no longer have to worry about that sort of thing.
> Also, this article is comparing a list of concurrency constructs and one parallelism construct (pthread_create; threading.Thread doesn't count as parallelism due to the GIL) to callbacks, something which has nothing to do with concurrency at all. Very odd.
These are all related concepts. They are about doing different things "at the same time," just with different definitions of what the phrase means. The problems the author is describing are considerably worse in a scenario where actual processes are being spawned, but the nursery pattern is relevant in all of them. It's about control flow, not capabilities. Even using threading with a GIL can create execution paths that you aren't really aware of until you actually look through the code or debug.
Of course not. It also isn't original, or a "silver bullet" that I find relevant to general concurrent programming.
> I don't really understand what you mean by this.
I was trying to keep my rant a bit short, but my point is that the only practical example for a post that puts a lot of effort into complaining about a core Go construct (it's the title) does not even remotely apply to Go.
This is partly due to Go being shaped around this very core construct.
> Why exactly is branching better?
"goto" can (sometimes, as there are many valid use cases) be problematic as it can make control flow very obscure when read. "go" is extremely clear to read, and only has the concern that you don't know if a given function has created goroutines. It is of course usually described by documentation, or implicit from functionality, that such a thing will occur, and unlike goto, the code is extremely clear about what is going on if you read it.
Furthermore, whether a function call has created goroutines is by itself not a concern. What can be a concern is a resource that must be closed (e.g. a file) being referenced after you close it. That is not related to concurrency, but to lifetimes.
By lifetimes, I do not necessarily mean Rust-style compiler-enforced lifetimes (which I like), but simply API contracts. A library may store a reference to something you pass, and may use this reference in later calls, potentially after you invalidated the reference by closing a file descriptor.
A joined execution does not even remotely solve this problem, as it is not related to concurrency. It solves a different problem, related to just managing concurrent execution.
> These are all related concepts. They are about doing different things "at the same time, ...
They are not at all. Callbacks are often used together with certain types of concurrency constructs, such as an event-loop that can call callbacks upon various events. In JS, for example, the concurrency construct is a single global event-loop that calls tasks/microtasks.
Callbacks themselves, however, are not a concurrency construct. Thus, comparing them to concurrency constructs is very weird.
(A SAX parser is an example of a non-concurrent use of callbacks.)
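For reference, a minimal SAX example with Python's standard `xml.sax` module: the parser invokes the handler's callbacks synchronously on the calling thread, with no concurrency involved. The `TagCollector` class and the sample document are illustrative.

```python
import xml.sax


class TagCollector(xml.sax.ContentHandler):
    """Records element names in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        # Called back by the parser for every opening tag.
        self.tags.append(name)


handler = TagCollector()
xml.sax.parseString(b"<doc><item/><item/></doc>", handler)
```

By the time `parseString` returns, every callback has already run, so the callbacks are plain control flow rather than a concurrency construct.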
> Even using threading with a GIL can create execution paths that you aren't really aware of until you actually look through the code or debug.
It does not create execution paths that are not easily visible (you know nothing about a program until you look through the code), but yes, threading.Thread can lead to some unanticipated behavior.
Note that I was excluding this as a parallelism construct, not a concurrency construct. The behavior will be some form of cooperative multi-tasking.
EDIT: too much "however".
http://www.usingcsp.com/cspbook.pdf
The (simplified, and as I understand it) gist of concurrency in CSP is that the program is expressed as a series of parallel (PAR) and sequential (SEQ) operations.
Everything in a PAR block will run in parallel and all their outputs will be collected and fed as the input to the next SEQ. Everything in a SEQ will run sequentially as a pipeline until the next PAR. Every PAR must follow a SEQ and vice versa, as two PARS or SEQS next to each other will simply coalesce.
e.g.
    PAR
        longCall1
        longCall2
        longCall3
    SEQ
        reduceAllThreeResults
        doSomethingWithTheReducedResult
    PAR
        nextParallelOp1
        nextParallelOp2
        etc.
(The "basic" processes are just the send/receive of a value over a channel. By the way, Go channels are a direct lift from CSP!)
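The PAR/SEQ alternation from the example above could be approximated in Python roughly as follows. This is only a sketch: the `par` helper and the three `long_call_*` functions are hypothetical, and a thread pool is an assumption standing in for CSP processes.

```python
from concurrent.futures import ThreadPoolExecutor


def long_call_1():
    return 1


def long_call_2():
    return 2


def long_call_3():
    return 3


def par(*calls):
    """PAR step: run every call concurrently and return the results in call order."""
    with ThreadPoolExecutor(max_workers=len(calls)) as pool:
        futures = [pool.submit(call) for call in calls]
        return [f.result() for f in futures]


# PAR: the three calls overlap; the block ends only when all of them are done.
partial = par(long_call_1, long_call_2, long_call_3)

# SEQ: reduce the collected outputs, then use the reduced result.
total = sum(partial)
report = f"total={total}"
```

Each PAR step has a single exit point where all outputs are available, which is what lets the following SEQ step treat them as ordinary inputs.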
Also, CSP's genuine inter-process communication primitive is a _multi-way_ rendezvous which can synchronise an arbitrary number of processes, possibly more than two. This is called "interactions" in chapter 2 of Tony Hoare's CSP book [1], and channels are built on top of these interactions in chapter 4.
This multi-way synchronization is also present e.g. in the LOTOS specification language, which is an ISO standard. The CADP [2] verification toolbox offers various tools, such as a model checker, to verify LOTOS programs and also to generate executables. For those who know how special/weird the LOTOS syntax is, the CADP folks also develop the LNT language, which looks much more like Ada/Pascal.
[1] http://usingcsp.com/cspbook.pdf
[2] http://cadp.inria.fr/
The FAQ even mentions Occam, mentioned by another commenter:
"Occam and Erlang are two well known languages that stem from CSP. Go's concurrency primitives derive from a different part of the family tree whose main contribution is the powerful notion of channels as first class objects."
* Does anything else like this currently exist (other than the Trio library he mentions), which shows that it's a superior paradigm in practice?
* What are the cons to this approach? Why not do it?
I only see his nursery as being useful when you really want your async tasks to complete before the function in which they were dispatched returns. That's far from covering every use case of concurrency!
A lot of the value of concurrency is in background operations. These simply can't be tied to the duration of a function call on the dispatch thread. Doing so will literally kill any advantages of concurrency in the first place, might as well just block directly. (This is especially true of apps modelled as an update loop, you definitely don't want to block that loop.)
I can think of very few places where I'd actually want a nursery, and even in those cases I'd rather use the promises or fork&join already available.
His section "There is an escape" answers this criticism: "The nursery object also gives us an escape hatch. What if you really do need to write a function that spawns a background task, where the background task outlives the function itself? Easy: pass the function a nursery object."
He goes on to explain a bit more, but one of his points is what really ties this abstraction back into the goto discussion: "Since nursery objects have to be passed around explicitly, you can immediately identify which functions violate normal flow control by looking at their call sites, so local reasoning is still possible."
> A lot of the value of concurrency is in background operations. These simply can't be tied to the duration of a function call on the dispatch thread.
I think they can: applications with background threads would have an outer-level nursery at whatever "main" is for them, and that nursery would be passed into whatever function needs to spawn a background thread.
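A sketch of that idea, using `ThreadPoolExecutor` as a stand-in for the outer nursery (an assumption; Trio's nursery API differs). `start_flusher`, `background_flush`, and `app_nursery` are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

log = []
release = threading.Event()


def background_flush():
    # Keeps running after start_flusher has returned.
    release.wait()
    log.append("flushed")


def start_flusher(nursery):
    # The nursery parameter at the call site signals that this function
    # may leave work running after it returns.
    return nursery.submit(background_flush)


with ThreadPoolExecutor() as app_nursery:  # outer scope owned by "main"
    future = start_flusher(app_nursery)
    outlived_call = not future.done()  # start_flusher returned; the task is still pending
    release.set()

# Leaving the with block waited for every submitted task.
```

The background task outlives the function that spawned it, but not the scope that owns the nursery.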
If your background operation has those characteristics, maybe it is better to spawn a new process for it. Why a thread?
Perhaps, but only having a complex solution that handles 100% of use cases is less desirable than having a simple one that handles 80% of the most common use cases PLUS the ability to go deeper and use the complex method (if, and only if, it's necessary...)
Erlang's supervisor behaviours are basically that. They add some more stuff (the actual supervision) but fundamentally every process in a supervision tree will necessarily manage and outlive all its children processes.
Haskell's async[1] solution is very nice too, with the exception that restarting processes and fault tolerance aren't really there yet. On the other hand, in Haskell there's stuff like STM, which makes atomic updates to shared memory easy.
At the lower level side of things Rust's model is very nice too. You can statically verify that you don't have certain classes of bugs.
I'd be interested in implementing this in Go and throwing a web server implementation at it to see if it makes more sense. Though maybe that's too simple a use-case for it.
It'd be interesting to see how to implement the error propagation using channels, without having any control over the goroutine passed in. If the goroutine panics, how to capture that and pass it back to the nursery?
1) You should be able to declare scoped blocks that guarantee every task started in the block has finished by the time the scope ends.
2) This is fundamentally superior to all other forms of concurrency.
I get it; this is basically what async/await gets you, but conceptually you can spawn parallel tasks inside an awaited block, and know, absolutely that all of those tasks are resolved when the await resolves.
(this is distinct from a normal awaited block which will execute tasks inside it sequentially, awaiting each one in turn).
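In Python's asyncio, the closest existing shape of this is probably `asyncio.gather`: the tasks overlap, and the single `await` resolves only once all of them have finished. Note this is concurrency on one event loop rather than true parallelism; the `task` coroutine and its delays are made up for the sketch.

```python
import asyncio

finished = []


async def task(name, delay):
    await asyncio.sleep(delay)
    finished.append(name)
    return name.upper()


async def main():
    # All three run concurrently; the await resolves only when every one is done.
    return await asyncio.gather(task("a", 0.03), task("b", 0.01), task("c", 0.02))


results = asyncio.run(main())
```

`gather` returns results in argument order regardless of which task completes first.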
...seems like an interesting (and novel) idea to me, but I flat out reject (2) as ridiculous.
Parallel programming is hard, but the approach from rust, to give you formal verification, instead of arbitrarily throwing away useful tools seems much more realistic to me.
Take it back to the goto analogy: can you formally verify goto? Yes. Does that mean it's good to include as a language primitive? No.
You are basically taking an entire argument, ignoring its merits, and saying "but you can do it another way". You are exactly right, but you haven't rebutted the fact that structured concurrency is philosophically superior.
I love Rust and the community; they will get to the truth of this argument eventually. But I suspect the truth is that "njs was right", and I hope it's sooner rather than later.
> But I suspect the truth is that "njs was right", and I hope it's sooner rather than later.
I'm not sure. The problem can be and has been solved without the manual "pass the nursery object around" "escape hatch" for the general case of threads (because no: having the execution of spawning functions delayed by the lifetime of a thread is arguably reasonable in some cases, but certainly not in the general case of what threads are useful and used for).
It is still useful for tons of existing languages as a pattern anyway, but only if the use cases are suitable.
But the author is focused on a narrow use case of threads (and a narrow subset of the problems they introduce), and presents their solution as a general truth and a new fundamental control structure of computing, independent of already existing, in-production, and arguably better solutions; and without analyzing the new problems their silver bullet introduces.
> Then our guarantee is lost: the operations that look like they're inside the with block might actually keep running after the with block ends, and then crash because the file gets closed while they're still using it. And again, you can't tell from local inspection; to know if this is happening you have to go read the source code to all the functions called inside the ... code.
But actually you can "tell", or even better have a type system good enough to prevent such mistakes entirely, and the author even knows a bit about Rust, yet fails to state that at least that item is a solved problem there (by using a different and arguably more general approach).
Now I don't know enough to decide whether something is missing on the panicking background thread front, but if it is, that seems very solvable.
I don't buy that spawning threads (or even moral equivalents) and multiprogramming are in a situation similar to unstructured use of goto anyway. You have tons of problems applicable to one and not the other. And of course resource management is hard to get right with threads. But we have at least one production example of a language that gets it right on some points by leveraging more general ideas (and the other difficult points are mostly not addressed by the nursery idea, anyway).
However, I agree with you that there's no one-size-fits-all approach to concurrency. Sometimes you want long-lived tasks that communicate—the actor model, in other words—in which case fork/join doesn't buy you much. I do think that fork/join is frequently what you want, though.
I believe that python's `async for` construct (which may be stolen from c#, but I'm not sure) does this, and another user mentioned Promise.all which in javascript I believe allows the child promises to resolve concurrently.
In both cases, these are for concurrency, not parallelism.
I don't believe either of the constructs you've mentioned do this, and I'm not aware of any that do.
The trivial counter example would be a deeply nested `setTimeout(..., 5000)` inside the javascript code.
It doesn't matter if you've called Promise.all or not.
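The same counterexample can be written with asyncio, assuming `asyncio.create_task` as the analogue of a nested `setTimeout`: the gathered coroutines resolve while a task they spawned is still pending. The `child` and `background` coroutines are illustrative.

```python
import asyncio


async def main():
    release = asyncio.Event()
    stray = []

    async def background():
        await release.wait()

    async def child():
        # Fire-and-forget, like a nested setTimeout: gather never sees this task.
        stray.append(asyncio.create_task(background()))
        return "child done"

    results = await asyncio.gather(child(), child())
    # Checked right after gather resolves: the spawned tasks are not finished.
    still_running = [not t.done() for t in stray]

    release.set()
    await asyncio.gather(*stray)  # clean up before the event loop closes
    return results, still_running


results, still_running = asyncio.run(main())
```

Waiting on the children says nothing about work the children started on the side, which is the gap a nursery-style scope is meant to close.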
Also, regarding parallelism: Obviously async/await are for that; this is basically pitching an equivalent construct for parallel processing.
I think this is novel, frankly, and the alternatives being thrown around by people are by people who didn't read the article.
> The unbridled use of the go to statement has an immediate consequence that it becomes terribly hard to find a meaningful set of coordinates in which to describe the process progress. Usually, people take into account as well the values of some well chosen variables, but this is out of the question because it is relative to the progress that the meaning of these values is to be understood! With the go to statement one can, of course, still describe the progress uniquely by a counter counting the number of actions performed since program start (viz. a kind of normalized clock). The difficulty is that such a coordinate, although unique, is utterly unhelpful. In such a coordinate system it becomes an extremely complicated affair to define all those points of progress where, say, n equals the number of persons in the room minus one!
[0] http://www.u.arizona.edu/~rubinson/copyright_violations/Go_T...
Most code out there is fine using break and continue in loops, and early returns are pretty popular. All of these are unstructured programming (by the 60s definition). I don't think it's all that clear that it has won.
C/POSIX type threads have no language support for indicating what data is shared and which locks protect which data. That's a common cause of trouble. The big question in shared memory concurrency is "who locks what". Most of the bugs in concurrent programs come from ambiguities over that question.
Early attempts to deal with this at the language level included Modula's "monitors", the "rendezvous" in Ada, and Java "synchronized" classes. These all bound the data and its lock together. Rust's locking system does this, and is probably the most successful one so far. (Yes, the functional crowd has their own approaches.)
Go talked a lot about controlling shared memory use. The trouble with goroutines, as Go programmers found out the hard way, was that the "do not communicate by sharing memory; share memory by communicating" line was bogus. Even the original examples had shared data. But the language didn't provide much support for controlling that sharing.
Python is basically at the C level of sharing control over data, except that the Global Interpreter Lock keeps the low-level data structures from breaking. This prevents Python programs from doing much with multi-core CPUs. Since this is just another thread library for Python, it has the same limitations.
Real concurrency in Python with disjoint data, and without launching a heavy-weight subprocess, would be a big win. But this isn't it.
Now what you could do is break objects down into annotated types. Consider immutable vs. mutable in combination with thread-unsafe, thread-compatible, and thread-safe. Immutable data that's not thread-unsafe you can share freely across threads, all is well. L2/L3 caches are happy. Mutable that's thread-safe can similarly be shared at will. Then you can force that thread-compatible objects be wrapped & accessed only from a Mutex or transferred between threads as part of a move operation.
Rust gives you the tools to do all of this, and indeed does some of it, but as part of the steep learning curve of the ownership model.
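The "wrapped and accessed only from a Mutex" idea can be imitated in Python by binding a value to its lock. This is a sketch by convention only: the `Guarded` class is hypothetical, and unlike Rust, nothing stops a caller from keeping a reference to the value after the lock is released.

```python
import threading
from contextlib import contextmanager


class Guarded:
    """Hypothetical wrapper: the value is handed out only while the lock is held."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    @contextmanager
    def locked(self):
        with self._lock:
            yield self._value


counter = Guarded({"n": 0})


def bump():
    for _ in range(1000):
        with counter.locked() as state:
            state["n"] += 1  # read-modify-write happens under the lock


threads = [threading.Thread(target=bump) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

with counter.locked() as state:
    total = state["n"]
```

Binding the data to its lock answers "who locks what" at the type level, though here it is enforced only by discipline rather than by a compiler.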
One question though. The first part of the article says that "onclick" handlers should be replaced with nurseries as well. But I don't see how. Can someone explain?
I see fundamental issues with it: in some cases the checking model proposed by Rust is better; also, and this is related, you don't always fix things reliably by mindlessly extending lifetimes or delaying things until termination of others, in the same way that mindlessly switching a resource usage to a shared_ptr in C++ if you had a lifetime issue can't be done in the general case, because you could very well only be trading one bug for another. Checking capabilities are more useful and general than constructive limitations, especially when we have loads of counterexamples in use cases.
So without hesitation: yes, this is more structured than having no structure on the point considered, but that is not at all a sufficient criterion to make it the kind of panacea the author seems to think it is. I would have been way more positive had it been presented as a comparison with the other existing solutions, similar or not, and without that little escape hatch story that makes me think the author has found a hammer and now everything looks like a nail to them.
So how is creating a nursery that matches the lifetime and visibility of the page different from not using a nursery at all?
I'm intrigued by libdill but it's mysterious enough that I'm scared to include it in my project -- I don't want to risk getting sidetracked by having to debug my concurrency primitives.
Very excited!
In C# and Python, an async context shares a single thread loop that all joins jump back to. Simply starting a thread and waiting on it isn't suitable in most scenarios, namely web and UI dev.
I wish Go had something like this as well.
If you just skimmed, this is actually worth a careful read. The parallels between "go" and "goto" are explained very clearly, and you get some awesome Dijkstra quotes to boot!
Author is a PhD student, which bodes well for not reinventing wheels dumbly. Therefore, I look forward to the lit review of other concurrency & parallelism work through the last 40 years, which this writeup notably lacks (author mentions his stack of papers to review).
Your ideas are intriguing to me and I wish to subscribe to your newsletter.
ed: author has phd, not is a student.
Except that's exactly what the author did. They just reinvented scoped threadpools.
I have no direct practical knowledge of Golang, but I have worked on a large application that used BlockingQueue for concurrent communication and one which extensively used service buses for communication; in both it was hard to understand and reason about flow.
After some years with Scala Futures I'd say they work well and reason well. They can be seen as normal function calls returning Future instead of another 'container'.
They reflect the black box mentioned in the article, with one way in and one way out (e.g. when a method returns Future[_]).
The point about error handling: we use Option and Seq.empty for error handling on reads, Validation on create/write, and Either on side effects (like sending mail).
(yes, they are still leaky abstractions e.g. when debugging, but work fine most of the time)
As others have mentioned, reusing the "await" keyword could cover a lot of these nursery scenarios.
And this is a valid approach. But pretending that this should be the only one and that this is in a way similar to unstructured goto vs structured programming? I'm not buying it. Because there will be long lived global nurseries floating around in big enough codebases, effectively eliminating all the guarantees they are supposed to provide for the affected threads. I mean; I'm not sure they can even guarantee the advantages they are supposed to provide (in the sense of providing new easy to check properties, with actual tools existing capable of checking them).
Don't get me wrong. I find the approach interesting, and will happily use it where applicable, but the comparison to goto just does not make sense, nor does the fiction that threads are best modeled by always being contained in managing function calls (hmf, except when they are not...). The "escape hatch" is so big that it just plain devalues the solution compared to not having it (or having it only in vastly more constrained ways) and then obviously not pretending this is what should replace traditional spawning (and even more) everywhere.
You ought to look into Erlang and Elixir on the BEAM vm/runtime. It's arguably the best example of this kind of concurrency (greenthreading, async) done properly with regards to error handling.
I don't write Elixir or Erlang, but I believe this process is managed by the supervisor. You can select various behaviours for when a process crashes or errors out[1]. For instance, you can have a process simply restart after it crashes. Combined with a fail-fast mentality, this produces remarkably fault tolerant and long lived applications.
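A very rough sketch of the restart-on-crash idea in Python. This is not OTP: real supervisors manage separate processes and offer several restart strategies, while the hypothetical `supervise` function below just retries a single callable in the same thread. `flaky_worker` is invented for the example.

```python
attempts = []


def flaky_worker():
    attempts.append(len(attempts) + 1)
    if len(attempts) < 3:
        # Fail fast instead of trying to recover locally.
        raise RuntimeError("crash")
    return "ok"


def supervise(worker, max_restarts):
    """Hypothetical one-child supervisor: restart on crash, give up after max_restarts."""
    restarts = 0
    while True:
        try:
            return worker()
        except Exception:
            if restarts == max_restarts:
                raise  # escalate once the restart budget is used up
            restarts += 1


outcome = supervise(flaky_worker, max_restarts=5)
```

The worker crashes twice and succeeds on the third attempt; if the restart budget ran out, the error would propagate to whoever called `supervise`.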
Supervision trees (or sync.WaitGroup in Go) are good tools for achieving the same end result. But it isn't as semantically protected as the article states. In Trio, the property is given by the program's scope.
However, if process creation/termination follows the scoping rules of the program, I have a hunch you run into situations where certain things are not only hard, but outright impossible to express.
Now, the author gambles that this is a good thing and we will eventually find good structural solutions to all the problems. I, on the other hand, am a bit more pessimistic because it has been tried before and found to be lacking.
I wonder how Trio handles error propagation.
Really what TFA’s discussing seems much more akin to OTP than raw Erlang. Or more specifically a subset of it. Occasionally I wish there were a few more OTP supervisor behaviors but nothing that’s a show stopper. Not familiar with Go’s WaitGroup but I haven’t seen it used much in code I’ve read.
You could achieve something similar in JavaScript with Promise.all() and await:
await Promise.all([
asyncFunc1(),
asyncFunc2(),
asyncFunc3()
])
Of course, that's not language-level and the point seemed to focus more on eliminating traditional branching than just adding another way to do it.
const promises = files.map(readFileAsync); // "nursery"
const fileContents = await Promise.all(promises); // "with"
I generally agree with his premise that it sucks having to figure out if a function is concurrent or not; i.e., does it return a value or a Promise/Future. I'm not sure if his solution solves that particular issue though, unless it's handled automatically in his "nursery.start_soon" function.
And that's for a very well written post, that tries to address all common issues.
And yet, people manage to get it wrong, or write facile responses like "re-implementing the fork/join".
Not to mention missing the whole nuance of what the author is talking about, which is not about the novelty of a feature, but about what it allows us to do (and even more so, what it constrains us from doing).
It's as if people being shown for loops and structured programming in the 60s responded with "this proposal just reinvents gotos". Or worse, that "this is more restrictive than gotos".
Yes, the author knows about Erlang's model. He writes about it in the post, and about how you can use his proposal to do something similar.
Yes, the author knows about Rust's model. In fact Graydon Hoare, the creator of Rust (now working at Apple on Swift), has read the post's initial draft and gave his comments to the author.
http://kimundi.github.io/scoped-threadpool-rs/scoped_threadp...
EDIT: He compares to Rust in a comment on the Reddit thread:
https://www.reddit.com/r/programming/comments/8es8x3/notes_o...
As an aside, HN users tend to sneer at Reddit, but this is another case where the discussion on Reddit is better than the one here.
The simple fact that Clojure can implement goroutines as a library shows how flexible the language is. Then there's immutability and a strong focus on simplicity among others.
It is well organized, well written, and specific about its claims.
Want a web app that sets up some long-running thing to run in the background while the request returns quickly? Well then you're going to need a nursery above the level of the request which is still available to every request. I don't see what that gives you above conventional threading. Oh, and you'd also need to implement your own runner in a thread to have a task failure not bring down the whole application.
Yes, the same thing is true of goto and other control flow patterns. Patterns change to match the tools available.
An implementation of this pattern could handle the spawning of functions however it wants and the language supports (as independent processes, threads, with an event loop, etc.).
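As one illustration of that point, here is a minimal sketch of the pattern on top of plain OS threads. `ThreadNursery` is a made-up name and this is not Trio's implementation: it only joins its children on exit and re-raises the first child error, and it does not cancel sibling tasks the way Trio does.

```python
import threading


class ThreadNursery:
    """Toy nursery: every thread started here is joined before the block exits."""

    def __init__(self):
        self._threads = []
        self._errors = []
        self._lock = threading.Lock()

    def start_soon(self, fn, *args):
        def runner():
            try:
                fn(*args)
            except Exception as exc:
                # Remember the failure so the parent can re-raise it.
                with self._lock:
                    self._errors.append(exc)

        thread = threading.Thread(target=runner)
        self._threads.append(thread)
        thread.start()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Wait for every child before control leaves the with block.
        for thread in self._threads:
            thread.join()
        # Surface the first child error, unless the body itself already failed.
        if exc_type is None and self._errors:
            raise self._errors[0]
        return False


results = []


def work(n):
    results.append(n * n)


def fail():
    raise ValueError("child failed")


with ThreadNursery() as nursery:
    nursery.start_soon(work, 2)
    nursery.start_soon(work, 3)
# Both children have finished by the time we get here.

caught = None
try:
    with ThreadNursery() as nursery:
        nursery.start_soon(fail)
except ValueError as exc:
    caught = exc
```

The same interface could presumably sit on top of processes or an event loop; only `start_soon` and `__exit__` would change.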
My project is currently struggling with how to migrate to Python async. The biggest challenge is the place where async and sync interface.
Just the other day, my colleague was wondering out loud about the possibility of using a context manager to constrain the scope of async. This is it. This is exactly what we were looking for.
This is a really useful property to have and reason about.
Instead of several independent coroutines with arbitrarily overlapping lifetimes, we can now think of all coroutines as organized in a single hierarchy with properly nested lifetimes.
The function call stack becomes a call tree - each branch is a concurrent execution.
You're free to spawn parallel tasks in async/await -- you just use Promise.all or Future.sequence, or whatever your language provides to compose them into a larger awaitable.
Nurseries seem to go a step beyond this by reifying the scheduling scope as the eponymous nursery object. This means that you have a new choice for where the continuation of your async task happens: a nursery passed in from some ancestor of the call tree. My gut says that this offers similar power to problematic fire-and-forget async tasks, but takes away the ability to truly forget about them.
My guess is that, in practice, you end up with some root level nursery in your call stack to account for this. But account for it you must! And while the overwhelming sentiment in these comments is pretty dismissive, I'd caution against downplaying the significance of this. It's basically like checked exceptions or monadic error handling.
I also think about how this maps to task or IO monad models of concurrency. It seems like there's an inversion of control. Rather than returning the reification of a task to be scheduled later, the task takes the reification of a runtime, upon which to schedule itself. I'm not sure what the ramifications of this are. Maybe it would help with the virality of async return values [1], but at the cost of the virality of nursery arguments.
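A small sketch of that inversion, using the standard library's `ThreadPoolExecutor` as a stand-in for a nursery (its `with` block also waits for all submitted work on exit). `log_async` and `handle_request` are hypothetical functions, not part of any real library.

```python
from concurrent.futures import ThreadPoolExecutor


def log_async(scope, sink, message):
    # Hypothetical library function: it cannot spawn on its own, so it has
    # to be handed the scope its background work will live in.
    return scope.submit(sink.append, message)


def handle_request(scope, sink, name):
    # The scope argument has to be threaded through every caller.
    log_async(scope, sink, "handling " + name)
    return "response for " + name


log_lines = []
with ThreadPoolExecutor(max_workers=2) as scope:
    response = handle_request(scope, log_lines, "req-1")
# Leaving the with block waits for every submitted task.
```

Note how `scope` shows up in both signatures: that is the "virality of nursery arguments" in miniature.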
Lastly, one thing this article nails is the power of being able to reason about continuation of control flow. Whether or not nurseries have merit as a novel construct, this article still has a lot of educational use by making this argument very clearly. Even if the author is wrong about nurseries being "the best", it sets a compelling standard that all control mechanisms--async or not--should have to explain themselves against.
I do have a couple questions:
- In the real world, would library APIs begin to get clogged with the need for a nursery on which to run an async task? Think async logging or analytics libraries.
- Would usages similar to long-running tasks that receive async messages be compatible? I'm thinking of usages of the actor model or channel model that implement dynamic work queues.
- Does this increase the hazard presented by non-halting async tasks?
[1] http://journal.stuffwithstuff.com/2015/02/01/what-color-is-y...
launch_thread(function, perhaps, some, initial, data);
The trouble with this approach to concurrency is twofold:
(0) It forces a hierarchical structure where one continuation of the branching point is deemed the “parent” and the others are deemed the “children”. In particular, if the forking procedure was called by another, only the “parent” continuation may return to the caller. This is unnatural and unnecessarily limiting. Even if you have valid reasons to guarantee that only one continuation will yield control back to the caller (e.g., to enforce linear usage of the caller's resources), the responsibility to yield back to the caller is itself a resource like any other, whose usage can be “negotiated” between the continuations.
(1) It brings the complication of first-class procedures when it is often not needed. From a low-level, operational point of view, all you need is the ability to jump to two (or more) places at once, i.e., a multigoto. There is no reason to require each continuation to have a separate lexical scope, which, in my example above, one has to work around by passing “perhaps some initial data” to `launch_thread`. There is also no reason to make “children” continuations first-class objects. If you need to pass around the procedure used to launch a thread between very remote parts of your program, chances are your program's design is completely broken anyway. These things distract the programmer from the central problem in concurrent programming, namely, how to coordinate resource usage by continuations.
That said, the article has good technical content. It proposed a new concurrency library with interesting properties. Concurrency comes with additional cost. The library proposes a paradigm to minimize certain costs and should provide punchy examples of how things can be done simply and efficiently with it.
But instead it is picking shallow fights with the go statement (does the author know about the "sync" package and WaitGroup?). Overall I found the advocacy section WAY too long. Use most of that real estate to show the goodness of your library, not to try to punch holes in the competitors. My 2c.
A question I had was with the API that's been chosen. The `nursery` is chosen as the reified object, and a function `start_soon` is exposed on it. Perhaps in other parts of the library there are other methods exposed on `nursery`? If not, in some languages it seems like the `start_soon` method itself would make more sense as the thing to expose. In use, it might look like this:
...
nursery {
(go) in
go { this_runs_concurrently_in_the_nursery }
go { this_also_runs_concurrently_in_nursery }
// make a regular function call passing it the nursery's `go`
some_func(go)
}
...
And elsewhere:
...
func some_func(go) {
do_something()
go {
nursery {
// This nursery is within the outer one.
(go2) in
go2 { do_stuff }
go2 { do_more_stuff }
}
}
}
...
This reminds me of "colored functions" (red vs blue) where it becomes imperative to know if a function you are calling returns a value or a Future/Promise.
Some languages allow annotating a function to indicate as much so the IDE can help. The particular solution he presents doesn't actually address this question: Is your function sync or async? You still have to know when calling a function if it's async and needs to be in a nursery or not.
Should a programming language abstract away whether a function is async or not? async/await is a step forward (C#/JS) but it still requires knowing if the child function is async or not.
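For anyone who hasn't run into it, this is roughly what the problem looks like in Python. `read_sync` and `read_async` are made-up names; the point is only that the two call sites look identical but hand back different kinds of things.

```python
import asyncio
import inspect


def read_sync():
    return "data"


async def read_async():
    return "data"


sync_result = read_sync()      # a plain value
pending = read_async()         # a coroutine object, not "data"
is_coroutine = inspect.iscoroutine(pending)

# The caller has to know it got a coroutine and drive it to get the value.
async_result = asyncio.run(pending)
```

Nothing at the call site tells you which of the two you are dealing with; you have to check the definition or lean on tooling.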
Let me try with something else.
Imagine you have try/catch/finally BUT NOT AS CONTROL FLOW CONSTRUCTS but "just api calls".
So your language needs to be used like:
foo()
exceptions.try{
bar()
this.catch{
}
}
It means you need to ALWAYS REMEMBER to "close" the start of the call.
Imagine how bad this could be. If only "try/catch" worked like "IF/ELSE/ENDIF", so you could not do something stupid like:
foo()
exceptions.catch{
what?()
}
bar()
https://wiki.rice.edu/confluence/display/HABANERO/Habanero-J...
cobegin/coend are limited to properly nested graphs, however fork/join can express arbitrary functional parallelism (any process flow graph) [1]
Yes, for graceful error handling it needs to form some sort of process tree or ATC (asynchronous transfer of control), which is implemented in Erlang/OTP and in Ada.
[1] http://www.ics.uci.edu/~dillenco/compsci143a/notes/ch02.pdf
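A sketch of the kind of graph meant here, assuming four hypothetical steps where C depends on A, and D depends on both A and B. As far as I can tell this "N"-shaped dependency cannot be written as nested cobegin/coend blocks without adding an ordering that isn't really required, whereas fork/join with futures expresses it directly.

```python
from concurrent.futures import ThreadPoolExecutor

order = []


def step(name, *deps):
    # Join on each dependency first, then record that this step ran.
    for dep in deps:
        dep.result()
    order.append(name)
    return name


# Four workers so that steps blocked on a dependency cannot starve the others.
with ThreadPoolExecutor(max_workers=4) as pool:
    a = pool.submit(step, "A")
    b = pool.submit(step, "B")
    c = pool.submit(step, "C", a)      # C waits for A only
    d = pool.submit(step, "D", a, b)   # D waits for A and B
```

C is free to run before B finishes, which is exactly the freedom a strictly nested structure would have to give up.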
However, the momentum of today's programming community may be too great to surmount. When goto was criticized, there were fewer people to convince to give up on it. Now, there are orders of magnitude more devs. And all of them are comfortable in the current way of doing things.
If you think reasoning about concurrency is hard, try testing and modelling it, especially for something distributed. This is where naive ideas about concurrency start to fail and the need for a solid foundation arises.