Some thoughts on asynchronous Python API design in a post-async/await world

(vorpus.org)

301 points · piotrjurkiewicz · 9y ago · 114 comments

57 comments · 6 top-level

tschellenbach9y ago· 19 in thread

Are there any languages that have really nailed this? I've used gevent and eventlet (both Python), and promises and callbacks (Node), and none of them come close to being as productive as synchronous code.

I'd like to try out Akka and Elixir in the future.

retrogradeorbit9y ago

Erlang (and by extension Elixir and LFE) has "nailed" it by making the actor pattern first class. Go's channels are great, but Go itself is quite low level. Also you should check out Clojure's core.async to see what improved channel constructs on top of a high level, lock-free, multithreaded language core look like.

Part of the problem with the Python ecosystem is the insular mindset of its proponents. Python fanboys have no interest in going and seeing what's on the other side. So the platform has become a bit of an echo chamber with Pythonistas declaring their clunky approaches the industry best.

You can see this by looking at how little love a CSP solution for Python gets [https://github.com/futurecore/python-csp] versus the enormous buy-in its more popular frameworks receive.

venantius9y ago

core.async is using locks under the hood - it's just hiding that from you as an implementation detail.

ezyang9y ago

I like to tell people that the killer app for Haskell is writing IO bound, asynchronous code. The secret weapon is do-notation, which lets you write code as if it were sequential, but have it desugar into what is (essentially) a series of chained callbacks.

I like to point at Facebook's use of Haskell as a good example of being successful in this space http://community.haskell.org/~simonmar/papers/haxl-icfp14.pd... It would be disingenuous to suggest that Haskell is good in all situations, but if there was one place where it should be used, this is it.

quotemstr9y ago

Haskell is good for lots of things, but I don't see it being particularly powerful in this application. The IO monad and do-notation let you write sequential code. So does Python.

¯\_(ツ)_/¯

Matthias2479y ago

I've written quite a lot of concurrent code over the last few years (network servers, protocols, ...) and overall I now like Go most.

The biggest reason for this is not necessarily that I think it has absolutely the best concurrency model, but that it's the most consistent one. Nearly all libraries are written for the model, which means they assume multithreaded access, blocking IO (reads/writes) and no callbacks. As a result most libraries are interoperable without problems.

Erlang/Elixir should have similar properties - however I haven't used it.

Javascript has a similar property because at least everything assumes the single-threaded environment and concurrency through callbacks (or abstractions of them like promises and async/await on promises). I also like the interoperability and predictability here. But sometimes nested callbacks (even with promises) lead to quite a bit of ugly code. And calling "async methods" is not possible from "sync methods" without converting them to async first (which could mean some big refactoring). So I prefer the Go style in general.

The worst thing from my point of view are all the languages that do not have a standard concurrency model, e.g. C++, Java, C#, and according to this article also Python. Most of them have several libraries for (async) IO which can be beautiful by themselves but won't integrate into remaining parts of the application without lots of glue code. E.g. boost asio is nice, but you need a thread with an EventLoop. If your main thread is already built around Qt/GTK you now need another thread and then have 2 event loops which need to interact. Same question for Java frameworks, e.g. integrating a Netty EventLoop in another environment (Android, ...). In these languages we then often get libraries which are not generic for the whole language but specific to a parent IO library (works with asio, works with asyncio, ...) and thereby some fragmented ecosystems.

A standard question that also always arises in these "mixed-threaded" languages when you have an API which takes a callback is: From which thread will this callback be invoked? And if I cancel the operation from a thread, will it guarantee that the callback is not invoked. If you don't think about these you are often already in bug/race-condition land.

quotemstr9y ago

C++? Java? Python? The traditional thread model isn't bad merely because it's traditional. I much prefer it to promise hell and to async-everything. About the only thing that beats it is CSP, which you can also represent sequentially without funky new keywords and which you can implement as a library for C++, Java, or Python.

I never understood why people tout Go's goroutine feature so much. You can have it in literally any systems language.

reality_czech9y ago

The whole point of Golang is that every library and every project that uses Go will support coroutines and channels. Sure you can write a toy project in a language like C that has these concepts, but your toy library will effectively be unusable with all of the other libraries that have ever been written for C. Any library that calls a blocking function will break your coroutine abstraction.

It's like saying that indoor plumbing is no big deal-- it's just liquid moving through a pipe. Well yes. Yes, it is. But if you don't have plumbing in your neighborhood, or a sewage treatment plant in your city, you can't fake it by fooling around in your garage. And frankly, it's not going to smell like a rose.

LukeShu9y ago

(I'm not terribly familiar with Python's threading, so I'm not going to talk about it)

> I never understood why people tout Go's goroutine feature so much. You can have it in literally any systems language.

There are two big reasons for it.

Firstly, goroutines are extremely lightweight. "Traditional" threading in C, C++, and Java means native OS threads, which are comparatively expensive. Sure, fiber/coroutine libraries exist for these languages, but they are far from common (and, the only fiber library for Java that I know of, Quasar, came after Go).

Secondly, Go's ecosystem encourages CSP-style message-passing, rather than "traditional" memory-sharing. This is channels, not goroutines, but they make working with goroutines very nice. This is less concrete than the first reason; you certainly can implement message-passing in any of the other languages' threading styles. But empirically, it doesn't happen as often. A factor in this is also that, unfortunately, many CS curricula don't discuss CSP, which means that Go's use of this is the first exposure many programmers have to it.
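
For anyone who has not seen the message-passing style outside Go, here is a rough Python sketch of the same idea. It is only an illustration under assumptions: `queue.Queue` stands in for a channel, plain OS threads stand in for goroutines, and the names `producer`, `squarer`, `run_pipeline` and the `_DONE` sentinel are made up for this example.

```python
import queue
import threading

_DONE = object()  # hypothetical end-of-stream marker for this sketch


def producer(out_ch, items):
    """Send each item on the channel, then signal completion."""
    for item in items:
        out_ch.put(item)
    out_ch.put(_DONE)


def squarer(in_ch, out_ch):
    """Receive values, square them, pass them on. No shared mutable state."""
    while True:
        item = in_ch.get()
        if item is _DONE:
            out_ch.put(_DONE)
            return
        out_ch.put(item * item)


def run_pipeline(items):
    """Wire two threads together with bounded queues and collect the output."""
    numbers = queue.Queue(maxsize=4)  # bounded: a fast producer blocks on put()
    squares = queue.Queue(maxsize=4)
    threads = [
        threading.Thread(target=producer, args=(numbers, items)),
        threading.Thread(target=squarer, args=(numbers, squares)),
    ]
    for t in threads:
        t.start()

    results = []
    while True:
        item = squares.get()
        if item is _DONE:
            break
        results.append(item)

    for t in threads:
        t.join()
    return results
```

Each stage owns its data and only communicates by sending values, so there is nothing to lock. What Go adds on top is cheap goroutines and `select`, which this sketch does not attempt to reproduce.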

lmm9y ago

I use Scala without Akka. Just straightforward Futures and for/yield. It's great: the distinction between "=" and "<-" is minimal overhead when writing, but enough to be visible when reading code. You have to learn the tools for gathering effects (e.g. "traverse"), but you only have to learn them once (and they're generic for any effect, rather than being specific to Future, you can use the exact same functions to do error handling, audit logs and the like).

mi100hael9y ago

After using Akka-HTTP, I never want to write a HTTP service with anything else.

jackweirdy9y ago

How do you define "Productive"?

Aside from that, personally I've used both Akka and plain Scala with Futures, as well as node with Promises, bare callbacks and async (though I've not tried fibers). I find Promises and Futures are the perfect balance between simplicity of use and the benefits of using the Async model. There's no need to reason about threads, as they abstract away the actual async implementation, and the interface they expose is very easy to reason about.

dhd4159y ago

I'm surprised there aren't more mentions of Tasks in C# or F# on the .NET platform as examples of asynchronicity done well.

From the perspective of uniformity and availability, while C# provided asynchronicity via callbacks before the introduction of Tasks in the 4.5 release of the .NET Framework, all the core libraries that used callback-style async (as well as some that had been strictly synchronous-only) were updated with Task-based overloads, so there are no problems with Task-based async being inconsistently available. Additionally, adoption of Task-based async in third-party libraries has been high, so it's relatively uncommon to encounter code that does not support it.

From the perspective of code productivity, it's hard to get much better than simply adding the async and await keywords where necessary. As a very simple example, consider a typical server application that receives requests via HTTP, processes them via an HTTP call to another service as well as a database call, and then returns an HTTP response. The sync code (blocking with a thread-per-request model) might look something like this:

    void handleRequest(HttpRequest request) {
        var serviceResult = makeServiceCallForRequest(request);
        var databaseResult = makeDatabaseCallForRequest(request);
        sendResponse(constructResponse(request, serviceResult, databaseResult));
    }

In order to make that same process async (non-blocking with a dynamically-sized thread pool handling all requests), the code would look like this:

    async Task handleRequestAsync(HttpRequest request) {
        var serviceResult = await makeServiceCallForRequestAsync(request);
        var databaseResult = await makeDatabaseCallForRequestAsync(request);
        await sendResponseAsync(constructResponse(request, serviceResult, databaseResult));
    }

It could even be taken one step further to make the service request and database call concurrently if there were no dependencies between the two which would reduce processing latency for individual requests:

    async Task handleRequestAsync(HttpRequest request) {
        var serviceResultTask = makeServiceCallForRequestAsync(request);
        var databaseResultTask = makeDatabaseCallForRequestAsync(request);
        await sendResponseAsync(constructResponse(request, await serviceResultTask, await databaseResultTask));
    }

I've added asynchronicity into a C# server application as above with substantial improvements in both individual request latency and overall scalability. I'm now working on a Java8 system and bemoaning the comparatively primitive and inconsistent async capabilities in Java8.
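
Since the article is about Python, here is roughly how the sequential and concurrent variants above might look with asyncio. This is a sketch, not a port: the two `make_*_call` coroutines are hypothetical stand-ins that just sleep, and the handler returns a tuple instead of sending a response.

```python
import asyncio


async def make_service_call(request):
    await asyncio.sleep(0.05)  # stand-in for an HTTP call to another service
    return f"service({request})"


async def make_database_call(request):
    await asyncio.sleep(0.05)  # stand-in for a database query
    return f"db({request})"


async def handle_request_sequential(request):
    # One call after the other: total latency is the sum of both.
    service_result = await make_service_call(request)
    database_result = await make_database_call(request)
    return (request, service_result, database_result)


async def handle_request_concurrent(request):
    # gather() schedules both coroutines and waits for both results,
    # so total latency is roughly that of the slower call.
    service_result, database_result = await asyncio.gather(
        make_service_call(request),
        make_database_call(request),
    )
    return (request, service_result, database_result)


if __name__ == "__main__":
    print(asyncio.run(handle_request_concurrent("r1")))
```

One difference from C#: calling a Python coroutine function does not start it running. Writing `t = make_service_call(request)` and awaiting `t` later would still run the two calls one after the other, which is why the concurrent version goes through `asyncio.gather` (or `asyncio.create_task`).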

HeyImAlex9y ago

Writing concurrent code in go takes a lot less thinking than js. Or... a different kind of thinking? But holistically I greatly prefer it for complex asynchronous code.

Lack of generics on channels really hurts the library ecosystem though. Many things you need to write yourself.

daurnimator9y ago

Try lua with cqueues http://25thandclement.com/~william/projects/cqueues.html

qznc9y ago

Concurrent ML according to Andy Wingo. He recently wrote a good series on concurrency in programming languages: https://wingolog.org/tags/concurrency

RX149y ago

Seriously check out crystal. Go's goroutines seem to do quite well, and crystal is pretty close to go in terms of concurrency, but is a higher-level language overall.

mentat9y ago

goroutines with channels are well loved for concurrency in Go

junke9y ago

For Common Lisp, see lparallel and lfarm.

https://lparallel.org/overview/

xamlhacker9y ago

Try await/async in F#.

justinsaccount9y ago· 15 in thread

I feel like I'm too dumb to understand any of this. And I've been writing python for 12 years.

Just give me greenlets or whatever and let me run synchronous code concurrently.

  async def proxy(dest_host, dest_port, main_task, source_sock, addr):
    await main_task.cancel()
    dest_sock = await curio.open_connection(dest_host, dest_port)
    async with dest_sock:
      await copy_all(source_sock, dest_sock)

Are you kidding me? Simplified that is

  async def func():
    await f()
    dest_sock = await f()
    async with dest_sock:
      await f()

Every other token is async or await. No thank you.

jeswin9y ago

Are you saying using greenlets is any simpler than this? IMO that mechanism looks way more complex compared to this. And will probably be less efficient.

The point is this: threads are still expensive in bulk (the CPU has to shuffle a lot of data every time you switch). So all kernels have mechanisms to support parallel IO operations. An async library will use the best available kernel mechanism for IO; epoll on Linux, kqueue on BSDs, maybe IO Completion Ports on Windows (not sure). Turns out, doing that requires some help from the language itself or the code turns into a pyramidal mess. Async keyword addresses the readability aspect of code.

So:

a) It's more complex than synchronous code

b) But it solves the performance problem without too much cognitive overhead (once you get used to it).

quotemstr9y ago

> threads are still expensive in bulk

They don't have to be. First of all, even ordinary threads are more efficient than you might think. On a really awful low-end Android 4.1 device, I can pthread_create and pthread_join over 5,000 threads per second. On a real computer, my X1 Carbon Gen4, I can create and join over 110,000 threads per second. (And keep in mind that each create-join pair also forces two full context switches.)
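
If you want a ballpark figure for your own machine, a crude version of this measurement in Python might look like the sketch below. Treat it as an assumption-laden approximation: it goes through `threading.Thread`, so it includes interpreter overhead that a C program calling pthread_create and pthread_join directly would not pay, and the function name `create_join_rate` is made up here.

```python
import threading
import time


def _noop():
    """Thread body that does nothing, so we time only create + join."""


def create_join_rate(n=2000):
    """Return how many threads per second we can start and join, one at a time."""
    start = time.perf_counter()
    for _ in range(n):
        t = threading.Thread(target=_noop)
        t.start()
        t.join()
    elapsed = time.perf_counter() - start
    return n / elapsed


if __name__ == "__main__":
    print(f"{create_join_rate():,.0f} create/join pairs per second")
```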

For most applications, performance of regular threads is perfectly adequate. In these environments, the maintainability and debuggability advantages of using plain old boring threads make it really hard to justify using something exotic.

But suppose you do have big performance requirements: you can still use normal-looking threaded code. There's a difference between how we represent threads in source code and how we implement them. It's possible to provide green, userspace-switched threads without requiring "await" and "async" keywords everywhere. GNU Pth did it a long time ago, and there are lots of other fibers implementations.

> the CPU has to shuffle a lot of data every time you switch

Any green-threaded system (with or without explicit preemption points) also does context switches! Such a system maintains in user space a queue of things to work on: as the system switches from one of these work items to another, it's switching contexts! You have the same kind of register reloading and cache coldness problems that switching thread contexts has. There's no particular reason that you can do it much better than the kernel can do it, especially since switching threads in the same address space is pretty efficient.

zzzeek9y ago

> And will probably be less efficient.

they're not. gevent (and threads) are way faster than explicit asyncio, as all of asyncio's keywords / yields each have their own overhead. Here's my benches (disclaimer: for the "yield from" version of asyncio). http://techspot.zzzeek.org/2015/02/15/asynchronous-python-an...

quotemstr9y ago

People keep inventing funky new ways of representing threads.

With or without async, we're writing threads. (Promise chains are _also_ threads, very awkwardly spelled.) Really, we're arguing over whether we want our preemption points to be explicit or implicit. I prefer implicit myself, because the implicit style leads to much clearer code.

I understand how the JavaScript people might be excited that they can finally have threads, even if ugly ones, but there's no reason to get the rest of the world to switch to explicit-preemption-point threads.

int_19h9y ago

> Really, we're arguing over whether we want our preemption points to be explicit or implicit.

It's not even that!

It's not like you actually get to decide where to await in async/await code - you have to await on any call that is async, if you expect to get the result.

Now, if the underlying framework uses hot tasks - meaning the async operation starts executing as soon as it's invoked, and not when the returned task is awaited (as in e.g. .NET/C#) - you can choose to omit await to, effectively, fork your async "thread". So NOT doing await on something is just a fork operation. It's the reverse of regular sync code, where thread forks are explicit, and sequential flow on a single thread is implicit.

One other case where you wouldn't await is when you need to await on a combination of any or all tasks at the same time (i.e., wait until all tasks complete, or wait until one of the tasks completes). But the first one is equivalent to a thread join in sync code, and the second to a condition variable. So, again, you get a case where something more explicit in sync code is more implicit in async code, and vice versa.

Now note that all this is solely about syntax! You can take the C# compiler, and change it so that every awaitable statement is automatically awaited, except when the newly introduced operator "taskof" is applied, in which case you get the raw future instead. Voila! Cooperative future-based multitasking with implicit preemption points. Yet it works exactly the same, and will even be able to call into and be called from any existing C# code compiled by the original compiler.

I suspect that this will be the next step after async/await, once enough people notice that the default (non-await) behavior is something that they need very rarely, and figure out that it's better to rather change the syntax so that the much more common thing (await) is implicit. Similar to how the use of =/== for assignment and comparison has won out over :=/= in imperative languages.
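
To make the fork / join / wait-any cases above concrete, here is how they might be spelled with Python's asyncio. It is a sketch under one assumption worth stating: Python coroutines are not hot, so the "fork" has to be written explicitly as `asyncio.create_task`. The `work` coroutine and its delays are invented for the example.

```python
import asyncio


async def work(name, delay):
    await asyncio.sleep(delay)  # stand-in for some async operation
    return name


async def fork_then_join():
    # create_task() schedules the coroutine right away: the "fork".
    slow = asyncio.create_task(work("slow", 0.05))
    fast = asyncio.create_task(work("fast", 0.01))
    # Awaiting the task later is the "join".
    return await slow, await fast


async def wait_all():
    # Equivalent of joining every thread: results come back in argument order.
    return await asyncio.gather(work("a", 0.02), work("b", 0.01))


async def wait_any():
    # Resume as soon as the first task finishes, then cancel the rest.
    tasks = [
        asyncio.create_task(work("slow", 0.5)),
        asyncio.create_task(work("fast", 0.01)),
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)  # let cancellation settle
    return [task.result() for task in done]
```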

bufordsharkley9y ago

After watching (Curio creator) David Beazley's presentation from earlier this year on async/await[0], I feel I finally get it. Recommended watching.

[0] https://www.youtube.com/watch?v=E-1Y4kSsAFc

insertnickname9y ago

The amount of times Beazley says "insane", "nightmare", etc. in this talk makes me wary.

dismantlethesun9y ago

Welcome to the wonderful world of writing anything in Javascript.

Imagine the same thing using Promises:

   def proxy(dest_host, dest_port, main_task, source_sock, addr):
      main_task.cancel()\
          .then(lambda _: curio.open_connection(dest_host, dest_port))\
          .then(lambda dest_sock: copy_all(source_sock, dest_sock))

noobiemcfoob9y ago

It's statements like this that have kept me from ever learning _anything_ in Javascript...

BonoboBoner9y ago

Well it is only useful when you really rely on asynchronous programming. Nobody states that every piece of code is supposed to be written like this. You should only use async/await when a thorough performance analysis shows that it is your bottleneck.

Think of handling a web request, where you have to do parallel I/O requests to subsystems like a database, a webservice, redis, and so on. I think async/await gives us a nice standard way of describing "hit me back once X is done".

foota9y ago

I don't think most code will be this dense with await.

sidlls9y ago

And Rust's developers think that 'unsafe' in third-party crates will be well-vetted and therefore actually "safe", most C developers don't think somebody will incorrectly free or screw with memory they've allocated and passed back to the caller, most C++ developers don't think anybody will (ab)use 'const_cast', and so on.

A lot of terrible bugs in code are caused by people making assumptions such as yours.

int_19h9y ago

I've been writing async/await code for the past 2.5 years, and no, it actually is typically this dense, if you count tokens (real identifiers are obviously longer, so it's not as bad character-wise, and awaits are not quite as prominent).

rudolf09y ago

I still use gevent any time I need async code. It's also easy to tack onto existing projects with its monkey patching. I've never seen a need to migrate away from gevent, even if it's inarguably a language hack.

imtringued9y ago

It's reasonable compared to the old way of having three layers of callbacks in Node.js.

quotemstr9y ago· 13 in thread

The idea espoused in this blog post, that

> if you have N logical threads concurrently executing a routine with Y yield points, then there are N**Y possible execution orders that you have to hold in your head

is actively harmful to software maintainability. Concurrency problems don't disappear when you make your yield points explicit.

Look: in traditional multi-threaded programs, we protect shared data using locks. If you avoid explicit locks and instead rely on complete knowledge of all yield points (i.e., all possible execution orders) to ensure that data races do not happen, then you've just created a ticking time-bomb: as soon as you add a new yield point, you invalidate your safety assumptions.

Traditional lock-based preemptive multi-threaded code isn't susceptible to this problem: it already embeds maximally pessimistic assumptions about execution order, so adding a new preemption point cannot hurt anything.

Of course, you can use mutexes with explicit yield points too, but nobody does: the perception is that cooperative multitasking (or promises or whatever) frees you from having to worry about all that hard, nasty multi-threaded stuff you hated in your CS classes. But you haven't really escaped. Those dining philosophers are still there, and now they're angry.

The article claims that yield-based programming is easier because the fewer the total number of yield points, the less mental state a programmer needs to maintain. I don't think this argument is correct: in lock-based programming, we need to keep _zero_ preemption points in mind, because we assume every instruction is a yield point. Instead of thinking about N**Y program interleavings, we think about how many locks we hold. I bet we have fewer locks than you have yields.

To put it another way, the composition properties of locks are much saner than the composition properties of safety-through-controlling-yield.

I believe that we got multithreaded programming basically right a long time ago, and that improvement now rests on approaches like reducing mutable shared state, automated thread-safety analysis, and software transactional memory. Encouraging developers to sprinkle "async" and "await" everywhere is a step backward in performance, readability, and robustness.
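
The "ticking time-bomb" scenario can be reproduced in a few lines of asyncio. This is a deliberately contrived sketch: `asyncio.sleep(0)` plays the role of a yield point that someone added later between a read and a write, and the function names are made up for the illustration.

```python
import asyncio


async def unsafe_increment(state):
    current = state["count"]
    await asyncio.sleep(0)  # the newly added yield point: other tasks run here
    state["count"] = current + 1  # writes back a stale value


async def locked_increment(state, lock):
    # With an explicit lock, the extra yield point no longer matters.
    async with lock:
        current = state["count"]
        await asyncio.sleep(0)
        state["count"] = current + 1


async def run_unsafe(n=100):
    state = {"count": 0}
    await asyncio.gather(*(unsafe_increment(state) for _ in range(n)))
    return state["count"]  # updates are lost, so this ends up below n


async def run_locked(n=100):
    state = {"count": 0}
    lock = asyncio.Lock()
    await asyncio.gather(*(locked_increment(state, lock) for _ in range(n)))
    return state["count"]  # every increment is preserved
```

Without the `sleep(0)` line, `unsafe_increment` is perfectly safe on a single-threaded event loop, which is exactly why the bug is easy to introduce later without noticing.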

vomjom9y ago

It's not clear what you're suggesting as an alternative. My understanding is that you're suggesting thread-per-request, which has many known flaws. There are three approaches to serving requests:

1. Thread-per-request. This is a simple model. You have a fixed-size thread pool of size N, and once you hit that limit, you can't serve any more requests. Thread-per-request has several sources of overhead, which is why people recommend against it: thread limits, per-thread stack memory usage, and context switching.

2. Coroutine style handling with cooperative scheduling at synchronization points (locks, I/O). This is how Go handles requests.

3. Asynchronous request handling. You still have a fixed-size thread pool handling requests, but you no longer limit the number of simultaneous requests with the size of that thread pool. There are several different styles of async request handling: callbacks, async/await, and futures.

#2 and #3 are more common these days because they don't suffer from the many drawbacks of the thread-per-request model, although both suffer from some understandability issues.
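
A small Python sketch of the difference between #1 and #3, for what it's worth. Everything here is illustrative: the handlers just sleep instead of doing I/O, and `InFlight`, `serve_thread_pool` and `serve_async` are names invented for this example. It records the peak number of requests being handled at once under each model.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class InFlight:
    """Counts how many requests are being handled at once (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.current = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.current += 1
            self.peak = max(self.peak, self.current)
        return self

    def __exit__(self, *exc_info):
        with self._lock:
            self.current -= 1
        return False


def serve_thread_pool(request_ids, pool_size):
    """Model 1: each request occupies a pool thread for its whole duration."""
    tracker = InFlight()

    def handle(request_id):
        with tracker:
            time.sleep(0.02)  # blocking I/O stand-in
            return request_id

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(handle, request_ids))
    return results, tracker.peak


def serve_async(request_ids):
    """Model 3: one thread, many requests suspended at their await points."""
    tracker = InFlight()

    async def handle(request_id):
        with tracker:
            await asyncio.sleep(0.02)  # non-blocking I/O stand-in
            return request_id

    async def main():
        return await asyncio.gather(*(handle(r) for r in request_ids))

    return asyncio.run(main()), tracker.peak
```

With eight requests and a pool of two, the thread-pool version never has more than two in flight, while the async version has all eight suspended on a single thread. Whether that matters depends on how many concurrent requests you actually expect.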

quotemstr9y ago

Those options aren't as distinct as you might imagine. Would calling it fiber-per-request make you happy?

(By the way: most of the time, a plain-old-boring thread-per-request is just fine, because most of the time, you're not writing high-scale software. If you have at most two dozen concurrent tasks, you're wasting your time worrying about the overhead of plain old pthread_t.)

I'm using a much more expansive definition of "thread" than you are. Sure, in the right situation, maybe M:N threading, or full green threads, or whatever is the right implementation strategy. There's no reason that green threading has to involve the use of explicit "async" and "await" keywords, and it's these keywords that I consider silly.

Matthias2479y ago

Don't be focused on "requests". Requests (where most people mean HTTP requests) are one layer where you need concurrency, but in principle you need it on multiple layers.

E.g. at first you have a server that accepts multiple connections and each must be handled -> Thread per connection or one thread for all connections? If you go for threads you might even need multiples, e.g. a reader thread, a writer thread which processes a write queue and a third one which maintains the state for the connection and coordinates reads and writes.

Then on a higher layer you might have multiple streams per connection (e.g. in HTTP/2), where you again have to decide how these should be represented.

Depending on the protocol and application there might be even more or other layers that need concurrency and synchronization.

But the general approaches that you mention do still apply here: You can either use a thread for each concurrent entity and use blocking operations. Or you can multiplex multiple concurrent entities on a single thread with async operations and callbacks. Coroutines are a mix which provide an API like the first approach with an implementation that looks more like the second approach.

guscost9y ago

> You have a fixed-size thread pool of size N, and once you hit that limit, you can't serve any more requests.

If your application acts as a stateless proxy between client machines and your persistence layer, can't you just spin up another instance and load balance them at any time? It's not the most efficient solution at scale, but lots of people use this strategy.

dom09y ago

> There are three approaches to serving requests:

Let's not forget about forking servers. The kind where each request forks.

Arnt9y ago

I think the general idea is that if you make your yield points explicit, you see them yourself, and if you generally try for 100% unit test coverage, you end up nudging yourself towards fewer yield points, a simpler state machine, and better karma overall.

The complexity you see affects yourself more than the complexity you don't.

I rather agree.

FWIW my friend Abhijit Menon-Sen wrote a blog post on the matter last year, about some code with excellent test coverage and explicit yield points: http://toroid.org/callback-heaven

nhumrich9y ago

You're saying that async will still have race conditions, but I disagree with that premise. On async, context switching only happens on an "await", which means race conditions won't happen. Race conditions typically happen when you have a shared global state and you read and write to it in two operations. For example a counter that you want to increment. In async it's very rare that you have a shared global state, but even if you do, reading and modifying the state are naturally right next to each other, not on opposite sides of the await.

jerf9y ago

"On async, context switching only happens on an "await", which means race conditions won't happen. Race conditions typically happen when you have a shared global state and you read and write to it in two operations."

That may be when they "typically" happen, but you can still get race conditions where you have two tasks that can run next, and you accidentally write an assumption about which will happen into your code when there is no such assumption in the scheduler. You certainly will get fewer of these with async/await than with pure event-handling-style code, because async/await carries more information about proper ordering of code, but yes, you can still get things that are correctly described as "race conditions".
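
A tiny asyncio illustration of that kind of ordering race, with the caveat that it is a made-up example: `load_config` and the two readers are hypothetical, and the sleep stands in for real I/O. There is no interleaved read-modify-write here at all, just an unstated assumption about which task finishes first.

```python
import asyncio


async def load_config(state, ready):
    await asyncio.sleep(0.01)  # stand-in for reading a file or socket
    state["config"] = "loaded"
    ready.set()


async def read_config_racy(state):
    # Silently assumes load_config has already finished.
    return state.get("config")


async def read_config_ordered(state, ready):
    await ready.wait()  # make the ordering assumption explicit
    return state["config"]


async def demo():
    state, ready = {}, asyncio.Event()
    _, racy = await asyncio.gather(
        load_config(state, ready),
        read_config_racy(state),
    )

    state2, ready2 = {}, asyncio.Event()
    _, ordered = await asyncio.gather(
        load_config(state2, ready2),
        read_config_ordered(state2, ready2),
    )
    return racy, ordered
```

In this sketch the racy reader runs before the loader has finished and sees no config, while the ordered reader waits on the event first.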

lmm9y ago

> Look: in traditional multi-threaded programs, we protect shared data using locks. If you avoid explicit locks and instead rely on complete knowledge of all yield points (i.e., all possible execution orders) to ensure that data races do not happen, then you've just created a ticking time-bomb: as soon as you add a new yield point, you invalidate your safety assumptions.

> Traditional lock-based preemptive multi-threaded code isn't susceptible to this problem: it already embeds maximally pessimistic assumptions about execution order, so adding a new preemption point cannot hurt anything.

You get an equal and opposite problem: whenever you add one more lock, you invalidate your liveness assumptions.
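
The classic instance of this is two threads taking the same two locks in opposite order. The Python sketch below is a contrived demonstration, not production code: it uses a barrier to force the bad interleaving and a timeout on the second acquire so that the program reports the deadlock instead of hanging. The function name is invented for the example.

```python
import threading


def opposite_order_attempt(timeout=0.1):
    """Two threads each hold one lock and then try to take the other."""
    lock_a, lock_b = threading.Lock(), threading.Lock()
    barrier = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            barrier.wait()  # both threads now hold their first lock
            # Without the timeout this acquire would block forever.
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            barrier.wait()  # keep `first` held until both have tried
            if acquired:
                second.release()

    t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
    t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return got_second
```

Neither thread gets its second lock. The usual remedy is a single global acquisition order, which is an invariant that every newly added lock has to respect.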

> The article claims that yield-based programming is easier because the fewer the total number of yield points, the less mental state a programmer needs to maintain. I don't think this argument is correct: in lock-based programming, we need to keep _zero_ preemption points in mind, because we assume every instruction is a yield point. Instead of thinking about N**Y program interleavings, we think about how many locks we hold. I bet we have fewer locks than you have yields.

I'll take that bet. You really don't have to yield very often - only when making a network request, and perhaps not even for that in the case of a fast local network. Whereas you have to lock every piece of state that you have.

marcosdumay9y ago

> Whereas you have to lock every piece of state that you have.

You need to lock every piece of shared state you have. Where "shared" means stuff that many threads must communicate among themselves. One tends to keep the number of that kind of state low, really low. When zero is not possible, the most common number by a wide margin is one¹.

If you have more than 1, they are normally completely independent pieces of state that will not be used at the same time. If you have more than 1, and they are not independent, the code is either the result of at least one PhD thesis, or it does not work (or, often, both).

I bet you do network requests more than once in your code.

1 - The size of the shared state does not matter, so it's often one really big state.

IgorPartola9y ago

Every piece of shared state. In the case of something like a web server, you only need one lock: when the connection is being accepted and handed off to a worker thread. How many reads and writes does your web server perform?

kmike849y ago

In his talk about threads (https://www.youtube.com/watch?v=Bv25Dwe84g0) Raymond Hettinger made a point: when you have a codebase with many locks the result of composition is often a sequential program.

hueving9y ago

> as soon as you add a new yield point, you invalidate your safety assumptions.

While true, locks aren't free from this problem. They have the inverse. If someone adds code that accesses a data structure that should be protected by a lock and they forget to add the lock, you also lose all of your safety assumptions.

Animats9y ago· 4 in thread

The main use case for all this async stuff is handling a huge number of simultaneous stateful network connections. At least, that was what Twisted was used for. Are there other use cases for this sort of thing that justify all the complexity that comes with it?

int_19h9y ago

It lets you easily write responsive UI apps without worrying about things like threads - you treat your app as a single conceptual thread, and use async IO operations on it by awaiting them. Since in practice every operation callback is a new item posted onto the event loop, this doesn't block said loop at any point, and UI remains responsive. So the developer can think in simple terms like "if this button is clicked, [await] download this file, then update this label and [await] send this email", instead of background worker threads with condition variables etc.

In particular, WinRT heavily promotes this approach for UWP apps.

TimJYoung9y ago

My problem with this justification is that these problems have been solved for a long time with simple message passing. In Win32, you just post a message to a window handle from the background thread to notify the UI of status updates, etc. Yes, you do need to worry about shared state/locks if that "message" includes more than a simple integer. But, these are also solved problems and rarely require more exotic lock-less queues, stacks, etc. for the majority of applications that use these types of architectures for UI background processing because the performance implications are inconsequential. Using a shared stack that uses a simple critical section will work fine for managing the messages, especially since Windows now has critical sections that can use spin locks to help minimize context switches.
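
The same post-a-message pattern exists in Python, for comparison. In this sketch (an analogy, not Win32) the asyncio event loop plays the role of the UI thread's message loop and `loop.call_soon_threadsafe` plays the role of posting a message to it from a background thread. The names `ui_main` and `background_download` are invented for the example.

```python
import asyncio
import threading


async def ui_main():
    """Runs on the loop thread; receives progress posted by a worker thread."""
    loop = asyncio.get_running_loop()
    inbox = asyncio.Queue()

    def background_download():
        # Runs on a plain thread. It never touches UI state directly,
        # it only posts messages to the loop that owns that state.
        for percent in (25, 50, 75, 100):
            loop.call_soon_threadsafe(inbox.put_nowait, percent)

    worker = threading.Thread(target=background_download)
    worker.start()

    seen = []
    while not seen or seen[-1] != 100:
        seen.append(await inbox.get())  # the loop stays free between messages

    worker.join()  # the worker has already posted its last message
    return seen
```

Only the loop thread ever reads or mutates `seen`, so nothing needs a lock beyond the queue hand-off itself.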

zzzeek9y ago

There's a very popular argument that if you don't use an explicit async-on-evented-IO approach for concurrency in general, and instead use threads or even an evented-IO approach that conceals the IO wait just like a thread scheduler, then your program is impossible to reason about and will have bugs forever. Glyph's "Unyielding" at https://glyph.twistedmatrix.com/2014/02/unyielding.html is one well-known blog post that describes this concept. For a lot of developers, the original reasons for event-based IO (e.g. many stateful network connections) are lost. The popularity of JavaScript, which in recent years has exploded as a server-side platform as well, is a key driver in this trend.
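As I understand it, the core of that argument can be sketched in a few lines of asyncio. The `transfer` function and the account dict are invented for illustration; the point is only that, on a single event loop, code between two awaits cannot be interleaved with another task.

```python
import asyncio

async def transfer(accounts, src, dst, amount):
    # No await between the balance check and the two updates, so no
    # other task on this event loop can run in between them.
    ok = accounts[src] >= amount
    if ok:
        accounts[src] -= amount
        accounts[dst] += amount
    # The only yield point comes after the state is consistent again.
    await asyncio.sleep(0)
    return ok

def run_transfers(n_tasks=100):
    async def main():
        accounts = {"a": 50, "b": 0}
        results = await asyncio.gather(
            *(transfer(accounts, "a", "b", 1) for _ in range(n_tasks))
        )
        return accounts, sum(results)
    return asyncio.run(main())
```

Moving the `await` to between the check and the updates would break that guarantee, which is exactly the kind of change the explicit-yield camp says is at least visible in the source.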

Animats9y ago

Part of the problem is that object-oriented programming is now out of fashion. If objects only allow one active thread inside the object at a time, you have a conceptual model of how to deal with concurrency. Rust takes this route, and Java has "synchronized". It's done formally, with object invariants, in Spec#. Objects in C++ are often used this way in multi-thread programs.

If you don't have some organized way of managing concurrency, you're going to have problems. Without OOP, what? "Critical sections" lock relative to the code, not the data. "Which lock covers what data?" is a big issue, and the cause of many race conditions.
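A minimal Python sketch of the "one active thread inside the object" idea, assuming a hypothetical `Counter` class. The lock lives in the object next to the data it covers, so the answer to "which lock covers what data?" is visible in the class itself.

```python
import threading

class Counter:
    """Monitor-style object: every method holds the object's own lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0  # only touched while self._lock is held

    def increment(self):
        with self._lock:
            self._value += 1

    def value(self):
        with self._lock:
            return self._value

def hammer(n_threads=4, n_incr=1000):
    counter = Counter()

    def work():
        for _ in range(n_incr):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value()
```

Python has no `synchronized` keyword, so the discipline here is by convention only; nothing stops a caller from reaching into `_value` directly.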

(The dislike of OOP seems to stem from the problems of getting objects into and out of databases in web services. One anti-OOP article suggests stored procedures as an alternative. Many database-oriented programs effectively use the database as their concurrency management tool. Nothing wrong with that, but it doesn't help if your problem isn't database driven.)

Python has the threading model of C - no language constructs for threads. It's all done in libraries. There's no protection against race conditions in user code. The underlying memory model is protected, by making operations that could break the memory model atomic, but that's all. CPython also has some major thread performance problems due to the Global Interpreter Lock. Having more CPUs doesn't speed things up; it makes programs slower, due to lock contention inefficiencies. So the use of real threads is discouraged in Python.

There's a suggested workaround with the "multiprocessing" module. This creates ultra-heavyweight threads, with a process for each thread, and talks to them with inefficient message passing. It's used mostly to run other programs from Python programs, and doesn't scale well.

So Python needed something to be competitive. There are armies of JavaScript programmers with no experience in locking, but familiarity with a callback model. This seems to be the source of the push to put it in Python. Like many language retrofits, it's painful.

Does this imply that the major libraries will all have to be overhauled to make them async-compatible?

codethief9y ago

I find it surprising that no one here comments on the actual topic of the blog post: namely, that the internal implementation of asyncio is opaque at best and this unfortunately propagates upwards to the public API. Personally, I have taken a look at its source code a few times as well (to understand what my code was doing because the docs were lacking details) and I remember that the callback hell paired with those additional user-space buffers the author mentions really made it a major PITA to reason about. Now, why should anyone worry about asyncio's internals? Heck, if everything was working, I wouldn't mind, either. However, as pointed out in the blog, there are quite a few edge cases where it isn't. Plus, the documentation traditionally doesn't do a particularly good job at explaining things. Or is it just the fact that the API is quite confusing sometimes that has caused me to take a look at the source code more often than I care to admit? (Compare https://news.ycombinator.com/item?id=12829759) Whatever it is, the fact is that asyncio's internals do matter, unfortunately.

…which is why I was happy to hear that not all hope is lost and that someone created an alternative. Now, I haven't taken a look at curio yet, so maybe I'm a bit quick to judge, but I already found it very refreshing that spending not even a minute to read the documentation already left me with a good idea of how it works and how I can use it. Kudos to the author(s), I will definitely give it a try!

systems9y ago

I think that's because it is actually really new; Perl 6 has only been considered production-worthy since 2016

also as of now, most people who used it complain it is slow

give it 2 more years before you worry, and for now continue with python or whatever you like to use

no one is in a rush to make perl 6 popular ... it is not a commercial project ... so don't bet your career on perl 6 ... yet
