No, no, Docker is not a sandbox for untrusted code.
We live in a bizarre world where somehow "you need a hypervisor to be secure" and "to install this random piece of software, run curl | sudo bash" can live next to each other and both be treated seriously.
The kata-containers [1] runtime takes a container and runs it as a virtual host. It works with Docker, podman, k8s, etc.
It's a way to get the convenience of a container with the benefits of a virtual host.
This is not the be-all and end-all (there are more options), but it is a convenient one that is better than typical containers.
LPEs abound: unprivileged user namespaces were a whole gateway that had to be closed, io_uring was hot for a while, eBPF is another great target, and I'm sure more will be found every year, as has been the case. Seccomp, unprivileged containers, etc. make a huge difference in stomping out a lot of the attack surface; you can decide how comfortable you are with that, though.
I would expect major distributions to have embargoed CVE access specifically to prevent this issue.
Start here to help give you ideas for what to research:
https://linuxsecurity.com/features/what-is-a-container-escap...
If you look at the interface contract, both containers and VMs ought to be about equally secure! Nobody is an idiot for reading about the two concepts and arriving at this conclusion.
What you should have written is something about your belief that the inter-container, intra-kernel attack surface is larger than the intra-hypervisor, inter-kernel attack surface, and so it's less likely that someone will screw up implementing a hypervisor so as to open a security hole. I wouldn't agree with this position, but it would at least be defensible.
Instead, you pulled out the tired old "educate yourself" trope. You compounded the error with the weaselly "are considered" passive-voice construction that lets you present the superior security of VMs as a law of nature instead of your personal opinion.
In general, there's a lot of alpha in questioning supposedly established "facts" presented this way.
Anything, including the Linux kernel, can be broken by such security vulnerabilities.
This is not a weakness in the design of containers. `npm install`, on the other hand, is broken by design (due to post-install scripts).
Also, you can use the runsc (gVisor) runtime for Docker. If you are careful not to expose vulnerable protocols to the container, nothing should escape it with that runtime.
Out of those, only the first one is actually exploitable in common setups.
CVE-2019-5736 requires either an attacker-controlled image or "docker exec". This is not likely to be the case in the "untrusted python" use case, nor in many Docker setups.
CVE-2022-0185 is blocked by the seccomp filter in default installs, so as long as you don't give your containers the --privileged flag, you are OK. (And if you do give this flag, the escape is trivial without any vulnerabilities.)
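To make the "default seccomp, no --privileged" point concrete, here is a minimal sketch of a launcher that builds a `docker run` command line and refuses flags that would switch those protections off. The helper name `build_docker_argv`, the resource limits, and the image tag are illustrative assumptions, not anything from the thread; the Docker flags themselves are standard `docker run` options. It only narrows the attack surface: the container still shares the host kernel unless a runtime such as runsc is selected.

```python
from typing import List, Optional, Sequence

# Substrings that would disable the protections discussed above.
FORBIDDEN_FLAGS = ("--privileged", "seccomp=unconfined")


def build_docker_argv(
    image: str,
    command: Sequence[str],
    runtime: Optional[str] = None,
    extra_flags: Sequence[str] = (),
) -> List[str]:
    """Build a `docker run` argv that keeps Docker's default seccomp profile.

    Hypothetical helper: it only assembles the argument list and does not
    start a container itself.
    """
    for flag in extra_flags:
        if any(bad in flag for bad in FORBIDDEN_FLAGS):
            raise ValueError(f"refusing flag that weakens isolation: {flag}")

    argv = [
        "docker", "run", "--rm",
        "--network=none",                    # no network access
        "--cap-drop=ALL",                    # drop all Linux capabilities
        "--security-opt=no-new-privileges",  # block setuid-style escalation
        "--read-only",                       # read-only root filesystem
        "--pids-limit=128",                  # assumed limit, tune as needed
        "--memory=256m",                     # assumed limit, tune as needed
    ]
    if runtime is not None:
        # e.g. "runsc" if gVisor is installed and registered with Docker
        argv.append(f"--runtime={runtime}")
    argv.extend(extra_flags)
    argv.append(image)
    argv.extend(command)
    return argv


# Example: the image tag is a placeholder for whatever image you use.
argv = build_docker_argv(
    "python:3.12-slim", ["python", "-c", "print(1)"], runtime="runsc"
)
```

The resulting list can be handed to `subprocess.run(argv, timeout=...)` on a machine with Docker installed.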
That is to say, Docker is typically a security win because you get things like seccomp and user/DAC isolation "for free". That's great. That's a win. Typically exploitation requires a way to get execution in the environment plus a privilege escalation. The combination of those two things may be considered sufficient.
It is not sufficient for "I'm explicitly giving an attacker execution rights in this environment" because you remove the cost of "get execution in the environment" and the full burden is on the kernel, which is not very expensive to exploit.
Docker is better for running arbitrary code compared to the direct `npm install <random-package>` that's common these days.
I moved to a Dockerized sandbox[1], and I feel much better now against such malicious packages.
1 - https://github.com/ashishb/amazing-sandbox

    @task(name="analyze_data", compute="MEDIUM", ram="512MB", timeout="30s", max_retries=1)
    def analyze_data(dataset: list) -> dict:
        # Your code runs safely in a Wasm sandbox
        return {"processed": len(dataset), "status": "complete"}

This is fundamentally awkward in a language with a type system as absurdly flexible as Python's. What if that list parameter contains objects that implement __getattr__? What if the output dict has an overridden __getattr__? Even defining semantics seems awkward, especially if one wants those semantics to simultaneously make sense and have any sort of clear security properties.
edit: a quick look at the source suggests that the output is deserialized JSON regardless of what the type signature says. That’s certainly one solution.
We stick to JSON to make sure we pass data, not behavior. It avoids all that complexity.
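A small sketch of why a JSON round trip passes "data, not behavior": whatever class an object had on one side, the other side only ever sees plain dicts, lists, strings, numbers, booleans and None. The names `SneakyDict` and `cross_boundary` are made up for this illustration, and this is not the project's actual serialization code.

```python
import json


class SneakyDict(dict):
    """A dict subclass whose item access runs custom code."""

    def __getitem__(self, key):
        return "tampered"


def cross_boundary(value):
    """Serialize to JSON text and parse it back, as a sandbox boundary might."""
    return json.loads(json.dumps(value))


sneaky = SneakyDict(status="complete")
plain = cross_boundary(sneaky)

# The original object still misbehaves; the round-tripped copy is a plain
# dict holding only the stored data.
before = sneaky["status"]
after = plain["status"]

# Objects with no JSON representation cannot cross at all.
try:
    json.dumps(object())
    rejected = False
except TypeError:
    rejected = True
```

The cost is that anything not expressible in JSON (custom classes, bytes, sets) has to be converted explicitly before it crosses the boundary.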
I’ve been building on that foundation: script runs in sandbox, all commands and file writes get captured, human-in-the-loop reviews the diff before anything executes. It’s not adversarial (block/contain) but collaborative (show intent, ask permission).
Different tradeoff than WASM or containers: lighter than VMs, cross-platform, and the user sees exactly what the agent wants to do before approving.
WIP, currently porting to PyPy 3.8 to unlock macOS arm64 support: https://github.com/corv89/shannot
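The "capture the write, show the diff, ask permission" flow described above can be sketched in a few lines. This is only an illustration of the idea under assumed names (`PendingWrite`, `render_diff`, `apply_if_approved`), not the linked project's implementation; the approval callback stands in for the human reviewer.

```python
import difflib
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class PendingWrite:
    """A file write captured from the sandboxed script, not yet applied."""
    path: Path
    new_text: str


def render_diff(pending: PendingWrite) -> str:
    """Return a unified diff between the file on disk and the proposed text."""
    old_text = pending.path.read_text() if pending.path.exists() else ""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        pending.new_text.splitlines(keepends=True),
        fromfile=f"{pending.path} (current)",
        tofile=f"{pending.path} (proposed)",
    )
    return "".join(diff)


def apply_if_approved(pending: PendingWrite, approve: Callable[[str], bool]) -> bool:
    """Show the diff to the reviewer and write the file only on approval."""
    if not approve(render_diff(pending)):
        return False
    pending.path.write_text(pending.new_text)
    return True
```

In an interactive tool, `approve` would print the diff and wait for a yes/no answer; here it is any function that takes the diff text and returns a boolean.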
Long, long ago, there was "repy"[1][2]. (This is definitely included in the "none succeeded" bucket, FWIW.)
I have been looking towards some kind of quick-start qemu option as a possibility, but the project will take a while.
If we want to isolate untrusted code at a very fine-grained level (like just a specific function), VMs can feel a bit heavy due to the overhead, complexity, etc.
This is so true
How does it work? Which WASM runtime does it use? Does it use a Python interpreter compiled to WASM?
https://github.com/mavdol/capsule
(From the article)
Appears to be CPython running inside of wasmtime
---
That is not safe at all. You could always hijack builtin functions within untrusted code.

    def untrusted_function():
        original_map = map
        def noisy_map(func, *iterables):
            print(f"--- Log: map() called on {func.__name__} ---")
            return original_map(func, *iterables)
        globals()['map'] = noisy_map
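
The snippet above rebinds `map` only in the globals of the module where it is defined. As a hedged sketch of the broader variant, code running in the same interpreter can also assign to the `builtins` module, which changes what `map` means for every module that has not shadowed the name. The helper names below are made up for the demonstration, and the original builtin is restored afterwards.

```python
import builtins


def hijack_map():
    """Replace builtins.map with a wrapper that records every call."""
    original_map = builtins.map
    calls = []

    def recording_map(func, *iterables):
        # Record the callable's name, falling back to repr() for objects
        # without __name__ (e.g. functools.partial instances).
        calls.append(getattr(func, "__name__", repr(func)))
        return original_map(func, *iterables)

    builtins.map = recording_map
    return original_map, calls


original_map, calls = hijack_map()
try:
    # Unrelated "trusted" code now goes through the wrapper without knowing.
    result = list(map(str, [1, 2]))
finally:
    builtins.map = original_map  # undo the hijack

restored = builtins.map is original_map
```

This is why isolation inside a single interpreter is hard to get right: the untrusted code and the host share the same mutable namespace, so the boundary has to sit outside the process (a Wasm runtime, a container, a VM).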