Containers and Docker: how secure are they?

(blog.docker.io)

117 points · jpetazzo · 12y ago · 54 comments

contingencies · 12y ago · 10 in thread

I support docker in its efforts. However, docker is too cute, too hyped, and too rapidly developed to trust with your security as yet. Quite frankly, you have to understand a bit more than how to call an API to have faith in your infrastructure's inherent security.

For example, in this article the author links to the 'list of dropped capabilities in the Docker code'. As it happens, I wrote that list quite some time ago, and wrote it for lxc-gentoo, a guest-generation script for raw LXC against an earlier kernel version with an earlier LXC userspace. Not only is the list now out of date, it no longer uses the preferred approach. Why is this? After some months of my raising the issue, one of the LXC devs finally added 'lxc.keep' (i.e. "deny all, allow some") as an alternative to the explicit drop ("allow all, deny some"). The keep approach is architecturally more secure against things like kernel upgrades that add or modify kernel capabilities.
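To make the drop-versus-keep point concrete, here is a minimal sketch in Python. It is not LXC or Docker code: the capability sets, the `CAP_NEW_POWERFUL` name, and both helper functions are hypothetical and exist only to show what happens when a kernel upgrade introduces a capability that neither list was written with in mind. (As far as I know, the actual LXC configuration keys are `lxc.cap.drop` and `lxc.cap.keep`; treat that as an assumption and check your LXC version.)

```python
# Hypothetical capability sets, for illustration only; real kernels define many more.
KERNEL_V1 = {"CAP_CHOWN", "CAP_NET_BIND_SERVICE", "CAP_SYS_ADMIN", "CAP_SYS_MODULE"}
# A later kernel adds a capability that did not exist when the lists were written.
KERNEL_V2 = KERNEL_V1 | {"CAP_NEW_POWERFUL"}

# "Allow all, deny some": the container gets everything except these.
DROP_LIST = {"CAP_SYS_ADMIN", "CAP_SYS_MODULE"}
# "Deny all, allow some": the container gets only these.
KEEP_LIST = {"CAP_CHOWN", "CAP_NET_BIND_SERVICE"}


def granted_with_drop(kernel_caps, drop):
    """Capabilities left after removing the drop list from what the kernel offers."""
    return set(kernel_caps) - set(drop)


def granted_with_keep(kernel_caps, keep):
    """Capabilities left when only the keep list is allowed through."""
    return set(kernel_caps) & set(keep)


def main():
    for name, caps in (("kernel v1", KERNEL_V1), ("kernel v2", KERNEL_V2)):
        print(name, "drop list grants:", sorted(granted_with_drop(caps, DROP_LIST)))
        print(name, "keep list grants:", sorted(granted_with_keep(caps, KEEP_LIST)))


if __name__ == "__main__":
    main()
```

On the first kernel both policies grant the same set. On the second, the drop list silently grants the new capability, while the keep list still grants only what was explicitly allowed.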

Furthermore, the docker people only included warnings about things this important when I added them (https://github.com/dotcloud/docker/commits/v0.5.0/lxc_templa...), namely: "WARNING: procfs is a known attack vector and should probably be disabled if your userspace allows it. eg. see http://blog.zx2c4.com/749" and "WARNING: sysfs is a known attack vector and should probably be disabled if your userspace allows it. eg. see http://bit.ly/T9CkqJ".

Again, I fully support docker's efforts but the article is ... misleading at best.

shykes · 12y ago

Hi, docker maintainer here.

There's a reason we keep saying docker is not yet production-ready.

Right now our focus is on usability and stabilizing the management API to make deployment-centric deployment awesome. You can be sure that before we tell anyone that they can use docker to sandbox untrusted code in a shared environment (which by the way is not the only use case of docker) we will be locking down our default lxc configuration and doing a sweep of all pending security issues.

For the record, we (dotCloud) have tens of thousands of lxc containers currently running untrusted code in production on shared infrastructure, and have had to monitor and maintain them 24/7 for several years. Before that we ran openvz. And before that, we ran vserver. So while docker itself may not yet be ready for production (and indeed we don't use it in production at dotcloud either), you don't need to worry about our stance on security. We care about it just as much as you do.

spindritf · 12y ago

> Before that we ran openvz. And before that, we ran vserver.

If you have a minute and can share, I'd love to hear why you switched away from vserver (and then openvz but especially vserver). Or maybe you have those transitions written up somewhere?

duskwuff · 12y ago

Do you allow any of that untrusted code to run AS ROOT within a container, though? (If so: what capabilities do you allow it to have?)

pmahoney · 12y ago

> we keep saying docker is not yet production-ready

Just a heads-up: I know this isn't your fault, but docker.io does not say this on the front page, About, or FAQ that I can see. In fact, it currently says "same container that a developer builds and tests on a laptop can run at scale, in production".

Docker looks very interesting, thanks for your work.

simonebrunozzi · 12y ago

Very well done, Solomon. I really admire what dotCloud is doing in this space.

Cheers,

nickstinemates · 12y ago

I learned a lot from this reply, thank you :) It's clear you have a passion for containers (something we have in common) and security (something I'm not an expert on).

First, I think it's a little disingenuous to say that your issue disappeared. No one is censoring the Docker issue list. If you could provide a bit more information (your github handle, the issue title, etc.) I'll be happy to investigate.

edit: the first point was addressed, thanks :)

Second, Docker is an open source project with a rich community and a large number of contributors for any project, even more so for one less than 6 months old. People like yourself with clear passion can only make it better. I encourage you to continue your contributions by opening an issue and working with the maintainers to solve it.

contingencies · 12y ago

> I encourage you to continue your contributions by opening an issue and working with the maintainers to solve it.

Unfortunately I don't have time to run docker. Right now I am working on a broader-goaled system internally which supports arbitrary virtualization platforms and integrates concerns around platform integrity, host integrity, failover, automated scale-out, network topology specification and development/operations processes.

Docker apparently aims to make deployment really easy, and does this for some subset of cases, but with ease of use sacrifices security for new users who cannot evaluate statements such as the comments I added to its template in the commits referenced above.

To be frank I am not sure this is a winning goal, and suspect that any attempt to criticize docker's place within broader concerns would more likely result in something close to negative feedback from the existing developer community rather than an abstract thoughtfest resulting in wins for everyone. Happy to discuss further by email.

jpetazzo (OP) · 12y ago

Regarding lxc.drop vs lxc.keep: of course, we eventually want to switch to the latter, since it's obviously better to "deny all, then allow some" than the opposite. And I can only be grateful that you provided those elements. It looks like you think your contribution wasn't taken into account, but it definitely was, and I'm sorry that you feel that way.

So why does Docker still ship with lxc.drop? Well, a large number of people are still using LXC 0.7, which doesn't support lxc.keep, AFAIK. But it is very likely that Docker 1.0 will either require LXC 0.9, or totally get rid of LXC userland tools, or provide multiple implementations depending on what you have installed locally; and then lxc.keep will definitely kick in.

Also, the initial security choices of Docker represent a middle ground between "lock down everything" and "allow anything to happen". It had to be secure enough so that people could run regular app servers with a reasonable level of trust; and permissive enough to allow e.g. normal package managers to run.

Moreover, Docker is evolving: we recently added the "-privileged" flag (available in the master branch, and very probably in 0.6.0, due in a few days), which lets you switch between a more secure configuration, suitable for e.g. public PAAS environments, and a more permissive configuration, suitable for private PAAS, continuous integration, that kind of thing. And this is just one step in that direction.

contingencies · 12y ago

> It looks like you think... and I'm sorry that you feel that way.

Err, where did you get that idea? I couldn't be less concerned about the fate of my docker 'contribution' of inline comments (which was simply given out of shock that nobody seemed to be considering these vectors, and was merely copied from lxc-gentoo).

My motivation in commenting here is to prevent people from getting the wrong idea about security and LXC, something the article, IMHO, failed to do. In fact, it came across as fairly misleading to my mind.

daemon13 · 12y ago

Thank you for the elaboration.

jpetazzo (OP) · 12y ago · 8 in thread

By the way, if anyone knows of a documented exploit for LXC, I would love to hear about it. People (generally advocating VMs, zones, jails, OpenVZ...) will often say that "containers are not secure", but once you've taken some basic steps (like locking down kernel caps and device access) it becomes difficult to find an actual threat.

contingencies · 12y ago

> I would love to hear about it

See http://blog.zx2c4.com/749 and http://bit.ly/T9CkqJ

ak217 · 12y ago

Neither of these exploits works on stock Ubuntu 12.04 LTS, with LXC or otherwise (AppArmor kicks in).

Like jpetazzo, I would love to see a working LXC exploit. In my case, "working" == "can get host root when given container root on Ubuntu 12.04 or later".

jevinskie · 12y ago

Thank you so much for the first link! The very same "very black unix domain sockets magic" has been confounding me while reverse engineering a binary. OK, it calls recvmsg and then a wild FD appears from another process!? I had no idea...

justincormack · 12y ago

Any local root vulnerability will also work in a container, eg this one http://www.ubuntu.com/usn/usn-1914-1/ - note that a lot of kernel vulnerabilities are never really announced, just quietly fixed.

FooBarWidget · 12y ago

Well, there have been a number of local root exploits in recent years. That's worrying.

mjg59 · 12y ago

A variant on http://grsecurity.net/~spender/msr32.c would have worked up until the capability check was added. Capabilities will help you, but if your business model is built on the assumption that the kernel performs capability checks everywhere it should, then you really ought to be actively reviewing kernel entry points yourself.

foobarqux · 12y ago

Is there a guide for these "basic steps" for the uninitiated?

liuw · 12y ago

Do they count?

http://lwn.net/Articles/543509/ http://lwn.net/Articles/543273/

dap · 12y ago · 4 in thread

Good post, except that it's extremely misleading to use Solaris as the canonical example of non-Linux containers and then say that non-Linux containers "haven't had as much exposure" and "the source code isn't always available for peer review and auditing". Solaris containers (in Solaris first, and then illumos when Solaris became closed-source again) have been open source since 2005 and running in hostile production environments that whole time.

jpetazzo (OP) · 12y ago

True, I will update the blog post so that it feels less misleading. Source code for Solaris zones is indeed available; but I wouldn't consider it widely deployed. Of course, some people are using it in public hosting environments (the most notable example is probably Joyent); but I don't think that it's significant compared to the installed base of VServer, OpenVZ, or LXC out there.

I mean — it's trivially easy to get access to a Linux VPS, for a ridiculously low price (sometimes for free). Now compare with something equivalent based on Solaris zones.

But yeah, I'll definitely update the blog post, thanks!

bcantrill · 12y ago

> Source code for Solaris zones is indeed available; but I wouldn't consider it widely deployed.

Why would you not consider it widely deployed? If it needs to be said: just because you don't use something doesn't mean that others aren't. Speaking only for us (I work for Joyent), we have deployed hundreds of thousands of zones into production over the years -- and Joyent was running with FreeBSD jails before that. And that's just us; there are many others in the illumos/SmartOS/OmniOS, Solaris and FreeBSD communities who have been running this technology in production -- broadly -- for years. Perhaps OS virtualization is a new technology for you, but understand that it's not new for everyone; some of us have been doing this for a while -- widely deployed and in production.

dap · 12y ago

Thanks for offering to update the post. Disclaimer: I work for Joyent, but I think it's both cheap and easy to get access to zone-based hosting. :) And I'm looking forward to Docker support for zones!

pjmlp · 12y ago

I only used containers in HP-UX back in 2000, and back then they were pretty much configured in all HP-UX boxes I had access to.

pacala · 12y ago · 4 in thread

Any 0-day Linux root vulnerability qualifies. Linux is a large system, do your own risk analysis.

jpetazzo (OP) · 12y ago

Are you implying that gaining root access inside an LXC container means that you can escalate to the host system, or to sibling containers?

If yes, I would like to see an example of that (that works on systems with very minimal lockdown, i.e. using the device control group and kernel capabilities).

Otherwise, if you just mean that "0-day Linux root vulnerabilities can be used to escalate from non-root to root in a Linux container", that's a truism, and it also holds true for VMs or OpenVZ systems.

Just like 0-day vulnerabilities will help people to escalate from non-root to root in a FreeBSD jail or Solaris zone.

teraflop · 12y ago

No, I think the implication is that the kinds of kernel bugs that allow you to escalate from a non-root user to root within a container (by corrupting kernel data structures, for instance) will probably also allow you to escalate to root at the host level. If you can change the UID of a process, why should it be harder to change the UID namespace as well?

Plus, the user namespace functionality is fairly new and complex, and there have already been a few bugs found, e.g. [1]. I assume all the known bugs have been fixed, but that doesn't ensure that more aren't lurking somewhere.

[1] http://lwn.net/Articles/543273/

JoshTriplett · 12y ago

Many (though not all) local kernel exploits that allow you to escalate to root will also allow you to run arbitrary code in kernel-space, and there's only one kernel, not one kernel per container.

VMs will always be more secure than containers, simply through defense in depth; the only question is whether you want to trade away some performance and flexibility to increase security.

contingencies · 12y ago

> I would like to see an example of that...

http://blog.zx2c4.com/749 and http://bit.ly/T9CkqJ

SkyMarshal · 12y ago · 1 in thread

>Finally, if you run Docker on a server, it is recommended to run exclusively Docker in the server, and move all other services within containers controlled by Docker.

This looks like what CoreOS is providing, a stripped down barebones host, with all other services not strictly necessary in the host moved to the containers.

>Capabilities turn the binary “root/non-root” dichotomy into a fine-grained access control system. Processes (like web servers) that just need to bind on a port below 1024 do not have to run as root: they can just be granted the net_bind_service capability instead. And there are many other capabilities, for almost all the specific areas where root privileges are usually needed.

This is awesome; it has been a personal pain point in the past, trying to get a JVM running as non-root on Ubuntu Server. Theoretically it's easy with iptables, but in practice it can be tricky to get working exactly right.
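As a rough illustration of the capability model in the quoted passage, the Python sketch below checks whether the current process holds `net_bind_service` in its effective set. It assumes a Linux system exposing `/proc/self/status` with a hexadecimal `CapEff:` line, and that `CAP_NET_BIND_SERVICE` is bit 10 as defined in `linux/capability.h`. The function names are made up for this example.

```python
# Bit number of CAP_NET_BIND_SERVICE, as defined in linux/capability.h.
CAP_NET_BIND_SERVICE = 10


def effective_caps(status_text):
    """Return the effective capability bitmask parsed from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The value is a hexadecimal bitmask, e.g. "0000000000000400".
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_cap(mask, cap_bit):
    """True if the given capability bit is set in the bitmask."""
    return bool((mask >> cap_bit) & 1)


def main():
    try:
        with open("/proc/self/status") as f:
            mask = effective_caps(f.read())
    except OSError:
        print("could not read /proc/self/status (not Linux?)")
        return
    print("may bind ports below 1024:", has_cap(mask, CAP_NET_BIND_SERVICE))


if __name__ == "__main__":
    main()
```

Run as an ordinary user this will typically report False. A process granted only that one capability would report True without otherwise being root.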

secstate · 12y ago

To your first point, CoreOS is SmartOS for Linux :)

AYBABTME · 12y ago

I'm very interested in all those things, but I clearly lack a trajectory for learning them. Is there a reference I could read or a 'name' for that domain? How does one become educated on these things?

So far I've grabbed knowledge by reading papers on operating systems (and misunderstanding 80% of their content), reading man pages, reading Tanenbaum's textbooks, etc. But still I don't feel like I know or understand.

They say a lack of words for things renders one blind to one's own ignorance. Sometimes it's also that you just don't know what needs to be learnt.

gouggoug · 12y ago

"No exploit has been crafted yet to demonstrate this, but it will certainly happen in the feature". But will it be considered a future? ;)
