Is this any better than the Docker method of reusing the same base OS and compartmentalizing the applications? Is there that much to be gained by avoiding kernel/user-space transitions?
If you want to bring the isolation level of that process down to only what it absolutely needs to run, we've got things like jails and cgroups. You could probably run a Go app with no access to the filesystem, since everything is statically linked anyway.
I think it misses the reasons people are excited about virtualization. Reproducibility and uniformity of environment have a higher value than isolation for most software developers. The priorities may be inverted on the sysadmin side, but I don't think by enough to justify this kind of approach.
You can achieve the isolation with jails and cgroups, but not the performance improvements.
And this is what I mean when I say that taken to its conclusion you're just reinventing processes.
I think this kind of performance claim needs to be solidly proven by something at least vaguely like a real running application to be taken as a given.
Shouldn't the unikernel approach actually improve reproducibility greatly? You build your application and all its dependencies together, and the result should run exactly the same locally or on your Xen cloud.
An example of this is the Mirage website itself: all the live kernel outputs are stored on GitHub at https://github.com/mirage/mirage-www-deployment -- an explanation of the Travis CI workflow is at http://www.openmirage.org/wiki/deploying-via-ci
Yes, there's at least an order-of-magnitude improvement in packet processing, for example, if you bypass the kernel.
Our mobile app runs WebSocket-like connections over UDP with libsodium for crypto, and we're moving our stack off of Node.js for exactly that reason.
Things like file descriptor limits or connection counts are more of a pain in modern times, and they're not necessarily a problem caused by context switches.
That thing they were patching? That's the "data plane". It was incredibly high bandwidth relative to the control plane, since it was completely optimized for moving data.
Back to Unix. Unix is designed for the control plane. It is not designed for rapidly moving large amounts of data with maximum efficiency. Why do we use it today for what are arguably "data plane" tasks? Well, when all you have is a hammer...
Nowadays, you can have both on the same machine: run Linux on the first core or two, and reserve the remaining cores for your app. Along with huge page allocations to reduce TLB impact, you can literally own all CPU activity on those cores, and lock all of your RAM too.
That app is a normal Linux app, but it runs on the raw hardware, much as if there were no operating system at all. When you also give your app complete control over the network hardware, you've completely bypassed the kernel.
With UDP, you don't even need a networking stack, making this approach particularly attractive. Another poster mentioned saturating a 10Gb link. How about saturating four 10Gb links on a single machine? It can be done with the E5 processors and the software architecture I described above.
We're shooting for 10 million packets processed per second on a single, ~$20K machine. That's pretty sweet if you ask me, and a hell of a lot more than Node.js can do.
Once that's done, the Xen backend is just a matter of filling in the missing kernel components with OCaml libraries (or, in the case of OSv, C libraries, or in HaLVM's case, Haskell libraries).
In the case of MirageOS, though, we're using this fine-grained dependency control to implement other backends too: for instance, compiling the same source code to run as a FreeBSD kernel module or as a JavaScript library. There's already a Unix backend, so nothing special needs to happen to run it under Docker.