For example, in this article the author links to the 'list of dropped capabilities in the Docker code'. As it happens, I wrote that list quite some time ago, for lxc-gentoo, a guest-generation script for raw LXC, against an earlier kernel version and an earlier LXC userspace. Not only is the list now out of date, it no longer uses the preferred approach. Why is this? After some months of raising the issue, one of the LXC devs finally added 'lxc.cap.keep' (i.e. "deny all, allow some") as an alternative to explicit drop ("allow all, deny some"). The keep approach is architecturally more secure against things like kernel upgrades which add or modify kernel capabilities.
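To make the difference concrete, here is a minimal sketch of the two policies as plain set operations. It is not LXC code: the capability names are a small illustrative subset, and "new_feature" is a made-up stand-in for a capability introduced by a later kernel. In LXC's container configuration the two policies correspond to the lxc.cap.drop and lxc.cap.keep keys.

```python
# Illustrative subset of capabilities known to an older kernel.
OLD_KERNEL_CAPS = {"chown", "net_bind_service", "sys_admin", "sys_module"}
# Hypothetical: a kernel upgrade adds a capability nobody listed anywhere.
NEW_KERNEL_CAPS = OLD_KERNEL_CAPS | {"new_feature"}

# Two policies that are equivalent on the old kernel.
DROP_LIST = {"sys_admin", "sys_module"}      # "allow all, deny some"
KEEP_LIST = {"chown", "net_bind_service"}    # "deny all, allow some"


def effective_with_drop(kernel_caps, drop_list):
    """Everything the kernel knows about, minus what was explicitly dropped."""
    return set(kernel_caps) - set(drop_list)


def effective_with_keep(kernel_caps, keep_list):
    """Only what was explicitly kept (and that the kernel actually has)."""
    return set(kernel_caps) & set(keep_list)


if __name__ == "__main__":
    for label, caps in (("old kernel", OLD_KERNEL_CAPS),
                        ("new kernel", NEW_KERNEL_CAPS)):
        print(label, "drop policy:", sorted(effective_with_drop(caps, DROP_LIST)))
        print(label, "keep policy:", sorted(effective_with_keep(caps, KEEP_LIST)))
```

On the old kernel both policies grant the same set. After the hypothetical upgrade, the drop policy silently grants the new capability to the container, while the keep policy does not.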
Furthermore, the docker people only included this when I added https://github.com/dotcloud/docker/commits/v0.5.0/lxc_templa... ... things as important as WARNING: procfs is a known attack vector and should probably be disabled if your userspace allows it. eg. see http://blog.zx2c4.com/749 and WARNING: sysfs is a known attack vector and should probably be disabled if your userspace allows it. eg. see http://bit.ly/T9CkqJ
Again, I fully support docker's efforts but the article is ... misleading at best.
There's a reason we keep saying docker is not yet production-ready.
Right now our focus is on usability and stabilizing the management API to make deployment-centric deployment awesome. You can be sure that before we tell anyone that they can use docker to sandbox untrusted code in a shared environment (which by the way is not the only use case of docker) we will be locking down our default lxc configuration and doing a sweep of all pending security issues.
For the record, we (dotCloud) have tens of thousands of lxc containers currently running untrusted code in production on shared infrastructure, and have had to monitor and maintain them 24/7 for several years. Before that we ran openvz. And before that, we ran vserver. So while docker itself may not yet be ready for production (and indeed we don't use it in production at dotcloud either), you don't need to worry about our stance on security. We care about it just as much as you do.
If you have a minute and can share, I'd love to hear why you switched away from vserver (and then openvz but especially vserver). Or maybe you have those transitions written up somewhere?
Just a heads-up: I know this isn't your fault, but docker.io does not say this on the front page, About, or FAQ that I can see. In fact, it currently says "same container that a developer builds and tests on a laptop can run at scale, in production".
Docker looks very interesting, thanks for your work.
Cheers,
First, I think it's a little disingenuous to say that your issue disappeared. No one is censoring the Docker issue list. If you could provide a bit more information (your github handle, the issue title, etc.) I'll be happy to investigate.
edit: the first point was addressed, thanks :)
Second, Docker is an open source project with a rich community and a great many contributors for any project, even more so for one less than 6 months old. People like yourself with clear passion can only make it better. I encourage you to continue your contributions by opening an issue and working with the maintainers to solve it.
Unfortunately I don't have time to run docker. Right now I am working on a broader-goaled system internally which supports arbitrary virtualization platforms and integrates concerns around platform integrity, host integrity, failover, automated scale-out, network topology specification and development/operations processes.
Docker apparently aims to make deployment really easy, and does this for some subset of cases, but in exchange for ease of use it sacrifices security for new users who cannot evaluate statements such as the comments I added to its template in the commits referenced above.
To be frank I am not sure this is a winning goal, and suspect that any attempt to criticize docker's place within broader concerns would more likely result in something close to negative feedback from the existing developer community rather than an abstract thoughtfest resulting in wins for everyone. Happy to discuss further by email.
So why does Docker still ship with lxc.cap.drop? Well, a large number of people are still using LXC 0.7, which doesn't support lxc.cap.keep, AFAIK. But it is very likely that Docker 1.0 will either require LXC 0.9, or totally get rid of LXC userland tools, or provide multiple implementations depending on what you have installed locally; and then lxc.cap.keep will definitely kick in.
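As a rough sketch of the "multiple implementations depending on what you have installed locally" option, a template generator could pick the directive from the LXC version. This is hypothetical, not Docker's code: the function names are invented, the 0.9 threshold is simply taken from the comment above rather than verified, and version strings are assumed to be purely numeric (e.g. "0.7.5").

```python
def parse_version(text):
    """Turn a numeric version string such as '0.7.5' into (0, 7, 5)."""
    return tuple(int(part) for part in text.split("."))


def capability_directive(lxc_version, keep_supported_from="0.9"):
    """Pick the allow-list key when available, else fall back to the deny-list.

    keep_supported_from is an assumption based on the discussion, not a
    checked fact about LXC release history.
    """
    if parse_version(lxc_version) >= parse_version(keep_supported_from):
        return "lxc.cap.keep"
    return "lxc.cap.drop"


if __name__ == "__main__":
    for version in ("0.7.5", "0.9", "1.0"):
        print(version, "->", capability_directive(version))
```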
Also, the initial security choices of Docker represent a middle ground between "lock down everything" and "allow anything to happen". It had to be secure enough so that people could run regular app servers with a reasonable level of trust; and permissive enough to allow e.g. normal package managers to run.
Moreover, Docker is evolving: we recently added the "-privileged" flag (available in the master branch, and very probably in 0.6.0, due in a few days), which allows switching between a more secure configuration, suitable for e.g. public PAAS environments, and a more permissive configuration, suitable for private PAAS, continuous integration, that kind of thing. And this is just one step in that direction.
Err, where did you get that idea? I couldn't be less concerned about the fate of my docker 'contribution' of inline comments (which was simply given out of shock that nobody seemed to be considering these vectors, and was merely copied from lxc-gentoo).
My motivation in commenting here is to prevent people from getting the wrong idea about security and LXC, something the article, IMHO, failed to do. In fact, it came across as fairly misleading to my mind.
Like jpetazzo, I would love to see a working LXC exploit. In my case, "working" == "can get host root when given container root on Ubuntu 12.04 or later".
I mean — it's trivially easy to get access to a Linux VPS, for a ridiculously low price (sometimes for free). Now compare with something equivalent based on Solaris zones.
But yeah, I'll definitely update the blog post, thanks!
Why would you not consider it widely deployed? If it needs to be said: just because you don't use something doesn't mean that others aren't. Speaking only for us (I work for Joyent), we have deployed hundreds of thousands of zones into production over the years -- and Joyent was running with FreeBSD jails before that. And that's just us; there are many others in the illumos/SmartOS/OmniOS, Solaris and FreeBSD communities who have been running this technology in production -- broadly -- for years. Perhaps OS virtualization is a new technology for you, but understand that it's not new for everyone; some of us have been doing this for a while -- widely deployed and in production.
If yes, I would like to see an example of that (that works on systems with very minimal lockdown, i.e. using the device control group and kernel capabilities).
Otherwise, if you just mean that "0-day Linux root vulnerabilities can be used to escalate from non-root to root in a Linux container", that's a truism, and it also stands true for VMs or OpenVZ systems.
Just like 0-day vulnerabilities will help people to escalate from non-root to root in a FreeBSD jail or Solaris zone.
Plus, the user namespace functionality is fairly new and complex, and there have already been a few bugs found, e.g. [1]. I assume all the known bugs have been fixed, but that doesn't ensure that more aren't lurking somewhere.
VMs will always be more secure than containers, simply through defense in depth; the only question is whether you want to trade away some performance and flexibility to increase security.
This looks like what CoreOS is providing, a stripped down barebones host, with all other services not strictly necessary in the host moved to the containers.
>Capabilities turn the binary “root/non-root” dichotomy into a fine-grained access control system. Processes (like web servers) that just need to bind on a port below 1024 do not have to run as root: they can just be granted the net_bind_service capability instead. And there are many other capabilities, for almost all the specific areas where root privileges are usually needed.
This is awesome; it has been a personal pain point in the past, trying to get a JVM running as non-root on Ubuntu Server. Theoretically it's easy with iptables, but in practice it can be tricky to get working exactly right.
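For anyone wanting to check whether a process actually holds net_bind_service, here is a small sketch that tests a bit in the hexadecimal capability mask Linux reports as "CapEff:" in /proc/<pid>/status. The helper names are made up for illustration; the capability number (10) comes from linux/capability.h. Reading /proc only works on Linux, so that part is guarded.

```python
import os

CAP_NET_BIND_SERVICE = 10  # bit position, per linux/capability.h


def has_capability(cap_mask_hex, cap_number):
    """Return True if bit cap_number is set in a hex mask like '0000000000000400'."""
    return bool((int(cap_mask_hex, 16) >> cap_number) & 1)


def read_effective_mask(status_path="/proc/self/status"):
    """Return the CapEff hex string for a process, or None if it is not found."""
    with open(status_path) as status_file:
        for line in status_file:
            if line.startswith("CapEff:"):
                return line.split()[1]
    return None


if __name__ == "__main__":
    if os.path.exists("/proc/self/status"):
        mask = read_effective_mask()
        if mask is not None:
            print("CapEff:", mask)
            print("can bind ports below 1024:",
                  has_capability(mask, CAP_NET_BIND_SERVICE))
```

A non-root process that has been granted only this capability would show a mask with just bit 10 set, instead of the full mask a root process normally has.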
So far I've grabbed knowledge by reading papers on operating systems (and misunderstanding 80% of their content), reading man pages, reading Tanenbaum's textbooks, etc. But still I don't feel like I know or understand.
They say a lack of words for things renders one blind to one's own ignorance. Sometimes it's also that you just don't know what needs to be learnt.