The basic API (see page 17) consisted of:
pid_t switchto_wait(timespec *timeout)
- Enter an 'unscheduled state', until our control is reinitiated by another thread or external event (signal).
void switchto_resume(pid_t tid)
- Resume regular execution of tid
pid_t switchto_switch(pid_t tid)
- Synchronously transfer control to target sibling thread, leaving the current thread unscheduled.
The key abstraction is switchto_switch, which explicitly transfers control to another thread, leaving the current thread quiescent. It's analogous to setcontext(3) or longjmp(3).
I can write C code that calls Perl code that calls Java code that calls Python code quite readily, unless I want to use any of their concurrency abstractions like coroutines, async/await, futures, etc. But all of those abstractions merely box function invocation state--otherwise known as a call stack--for a single, logically synchronous flow of execution. And we know that all those languages can easily share call stacks.
The issue is that operating systems don't provide a consistent contract for creating call stacks and transferring control between them. They provide OS threads, but those are scheduled by the kernel, and their implementation and behavior are unnecessarily intermingled with tangential aspects of the runtime environment. It's as if every time you wanted to call a function you had to go through some special kernel facility, which is not the kind of thing that promotes easy and performant interoperability among different languages, or even among different libraries.
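To make "creating and transferring control to a different call stack" concrete, here's a minimal sketch using the existing userspace ucontext(3) family. To be clear, this is not the switchto API from the slides; it only illustrates the same shape of explicit, synchronous handoff within a single OS thread. The names run_demo, worker, and trace are made up for the example, and it assumes a libc that still ships ucontext.h (glibc does).

```c
#include <stddef.h>
#include <ucontext.h>

/* Hypothetical demo: two call stacks handing control back and forth
 * inside a single OS thread, with no scheduler involved. */
static ucontext_t main_ctx, worker_ctx;
static char worker_stack[64 * 1024];   /* the second call stack */
static int trace[4];                   /* order in which steps ran */
static int trace_len;

static void worker(void)
{
    trace[trace_len++] = 1;               /* first run on the worker stack */
    swapcontext(&worker_ctx, &main_ctx);  /* explicitly hand control back */
    trace[trace_len++] = 3;               /* resumed right where we left off */
    /* Returning from here continues at uc_link, i.e. main_ctx. */
}

/* Returns the number of recorded steps, or -1 on failure. */
int run_demo(void)
{
    trace_len = 0;

    if (getcontext(&worker_ctx) == -1)
        return -1;
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof worker_stack;
    worker_ctx.uc_link = &main_ctx;       /* where to go when worker returns */
    makecontext(&worker_ctx, worker, 0);

    if (swapcontext(&main_ctx, &worker_ctx) == -1)  /* run worker until it yields */
        return -1;
    trace[trace_len++] = 2;

    if (swapcontext(&main_ctx, &worker_ctx) == -1)  /* resume worker after its yield */
        return -1;
    trace[trace_len++] = 4;

    return trace_len;
}
```

Calling run_demo() from main should leave trace holding the steps in interleaved order, each side picking up exactly where it stopped. Note that every language runtime wanting to interoperate this way would still have to agree on this particular mechanism, which is the missing contract I'm complaining about.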