In a JetBrains IDE, for example, you check a devcontainer.json file into your repository. This file describes how to build a Docker image (or points to a Dockerfile you already have). When you open a project, the IDE builds the Docker image, automatically installs a language-server backend into it, and launches a remote frontend connected to that container (which may run on the same machine as the frontend or on a different one).
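For anyone who hasn't seen one, here is a rough sketch of the smallest useful file, generated from Python so it can be scripted per project. The `name`, `build.dockerfile` and `build.context` properties are from the dev container spec; the helper names, the project name and the assumption that the Dockerfile sits at the repository root are mine, not something stated above.

```python
import json
from pathlib import Path


def minimal_devcontainer(name: str, dockerfile: str = "../Dockerfile") -> dict:
    """Return a devcontainer.json body that builds from an existing Dockerfile."""
    return {
        "name": name,
        # Both paths are resolved relative to the devcontainer.json file, so
        # "../Dockerfile" assumes the Dockerfile lives at the repository root.
        "build": {"dockerfile": dockerfile, "context": ".."},
    }


def write_devcontainer(project_root: Path, config: dict) -> Path:
    """Write the config to <project_root>/.devcontainer/devcontainer.json."""
    target = project_root / ".devcontainer" / "devcontainer.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return target
```

Calling `write_devcontainer(Path("."), minimal_devcontainer("my-project"))` from the repository root would produce a file the IDE can pick up; anything beyond these two properties is project-specific.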
If you do anything with an AI agent, that thing happens inside the remote container where the project code files are. If you compile anything, or run anything, that happens in the container too. The project directory itself is synced back to your local system, but your home directory (and all its credentials) is off-limits to things inside the container.
It's actually easier to do this than to not, since it provides reusable developer tooling that can be shared among all team members, and gives you consistent dependency versions used for local compilation/profiling/debugging/whatever.
DevContainers are supported by a number of IDEs, including VS Code.
You should be using them for non-vibe projects. You should DEFINITELY be using them for vibe projects.
--runtime=runsc
--cap-drop=ALL
--security-opt no-new-privileges:true
it's pretty tight. That's how I use coding agents, FWIW.
I found cloning the repo when creating the devcontainer works best in JetBrains for some reason, and I hard-code the workspace directory so it's consistent between JetBrains and VS Code.
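If you want those flags and the fixed workspace path applied through devcontainer.json rather than typed by hand, something like the sketch below could work. It assumes the flags are passed via the spec's `runArgs` property and that the checkout is bind-mounted via `workspaceMount`; the `/workspace` path, the `harden` helper and the sample image are placeholders. `--runtime=runsc` only works if gVisor is installed and registered with Docker.

```python
import json

# The three flags quoted above, split into argv elements.
HARDENING_ARGS = [
    "--runtime=runsc",  # gVisor runtime; must already be set up on the host
    "--cap-drop=ALL",
    "--security-opt",
    "no-new-privileges:true",
]


def harden(config: dict, workspace: str = "/workspace") -> dict:
    """Return a copy of a devcontainer config with the flags and a fixed workspace path."""
    hardened = dict(config)
    # Keep any runArgs the project already has and append the hardening flags.
    hardened["runArgs"] = list(config.get("runArgs", [])) + HARDENING_ARGS
    # Same in-container path regardless of which IDE opens the project.
    hardened["workspaceFolder"] = workspace
    hardened["workspaceMount"] = (
        "source=${localWorkspaceFolder},target=" + workspace + ",type=bind"
    )
    return hardened


if __name__ == "__main__":
    base = {"name": "agent-sandbox", "image": "ubuntu:24.04"}
    print(json.dumps(harden(base), indent=2))
```

Dropping all capabilities can break tooling that expects them, so treat the list as a starting point to loosen, not a guaranteed-working default.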
https://github.com/anthropics/claude-code
It's a great starting point, and can be customized as needed. With the devcontainer CLI, you can even use it from a terminal, no GUI/IDE required.
https://news.ycombinator.com/item?id=47546014
That should be enough to get you going. It can be customized to your heart's content.
It sounds like if you make devcontainers point at an existing Dockerfile it should be easy to make these work together, so you and teammates both use the same configuration. I haven't used devcontainers though.
I need something like that, though; that's one of the things that pains me the most when trying to use vim/nvim for dev.
Now I run YOLO and haven't had any issues, and my subscription lasts much longer with less token consumption!
And then it has to bypass the sandbox to run those commands with elevated permissions.
This double trip boosts token usage.
I don't think the average developer workflow can really be limited to a workspace. You'll need commands that touch your system or require more privileges.
The downside for me, and the main reason I use VMs less than I did a few months ago, is that I need my agentic coding tools to use development tools a lot. And those tools need a lot of resources. And I have those resources on my laptop, which is a nice MacBook Pro with plenty of RAM and 16 CPUs. I can run VMs on this thing without issues, of course. But tools just run a lot faster when I run them outside those VMs. And agentic coding tools run builds all the time. We're talking some really non-trivial time savings here. Watching QEMU build a thing for 10 minutes that I know should build in 45 seconds is painful. Especially if it happens over and over again.
The trick is doing sandboxing without performance impact. And very soon you'll also want to be able to run local models. I've been toying with the latest Qwen and Gemma models on my laptop. I haven't gotten around to doing coding with those just yet. But apparently they aren't completely horrible at it. That won't work on most cloud-based VMs, unless you get a really big and expensive one. You could actually make that work if you only use them for a few minutes.
I was using Docker containers for sandboxing but it was annoying at times, not so much for the performance hit (which wasn’t noticeable running in OrbStack) but various little papercuts like no shared clipboard, node_modules pulling in Linux binaries or macOS binaries depending on whether I ran npm install from inside the sandbox or my own shell, etc.
With agent-safehouse, I get the isolation I want (more customizable than with Docker) without needing a VM or container.
You don't give your GH keys, email credentials and ssh keys to a coworker. They have their own accounts with scoped permissions. Need them to read an email? Forward it. Need them to work on a repo? Add them as a contributor and enforce the same branch policies you would for any human.
There are still risks, but they're similar to delegating work to humans, so it's up to you how much access and trust to give.
All that said, no way in hell I’m giving either access to production databases or environments.
These tools have only been in use for a short time and the current harnesses/system prompts are quite limited. Claude Code is mostly limited to your codebase, where you have version control. Excel is different.
I foresee that once people hand over more power to full agents there will be some nasty surprises. I'm sure there will eventually be demand for some kind of limits.
Being worried about escape from isolation etc. in a personal dev context seems like overkill, though.
The "agent never sees keys" approach prevents key exfiltration, but it doesn't prevent the agent from nuking what it has access to, nor does it prevent data exfiltration.
The best advice I heard to protect against prompt injection was "just use Opus" ( ... which was great advice before they lobotomized it ;)
But even without injection, most of the horror stories are from random error, or the AI trying to be helpful (e.g. stealing your keys or working around security restrictions, because they trained it to really want to complete a task [1]).
tl;dr yolo
[0] https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/
[1] https://www.reddit.com/r/ClaudeAI/comments/1r186gl/my_agent_...
Yes, agents.md yells not to mess with prod.
Probably what nudged it to run on prod in the first place
No, 'safety oriented' lab has a clause like that which can't be revoked historically. Anthropic, like the majority of 'don't be evil' firms, is a part of the great masquerade.
The whole experience was a bit jarring. When it knows I use Nix, the thing can easily `nix-shell -p nmap` its way into learning a lot more about my entire network than I am comfortable with. I think I'll edit the Containerfile further to also make Claude Code a user that can't install anything.
It's really like some "agent" (yeah I know, but I mean really an external person) takes control of your computer, with the same privileges as you. Idk why I had to see this happen in front of my eyes to fully realize this.
Of course every computer program has these rights, and you have to trust any of these devs...
Note that putting it in a container changes jack shit: if it still has network access, it can scan your network anyway, and it needs that access to install language deps and such to "do its work".
It's a security nightmare.
That's why VLANs are nice, as is requiring your container system (or VM or whatever) to attach its vNICs to a VLAN-tagged bridge on the host rather than the untagged interface that your trusted software uses. If the only thing that the container can hit on your LAN is your router, and your router refuses to forward traffic from that untrusted VLAN to anywhere other than the internet, then that cuts off another avenue for intelligence gathering.
That all assumes that you can't exploit the container daemon to get root, of course.
Perfect is the enemy of good.
Don't just rawdog a coding agent because a perfectly viable solution (containers) takes an hour or two of work to set up.
There's a world of difference between "it can scan your network" and "I just uploaded my private SSH keys to the cloud".
At home I have no proprietary software at all, modulo some original GBC ROMs I dumped to play with emulators, but that is not my 'daily computing' usage but an act of nostalgia.
That ability to transparently start a container and connect it to the SSH pipe is useful for isolation methods for coding agents involving containers, and I imagine it would work equally well for things like Firecracker VMs. It's made my experiment working with an "immutable OS" (Universal Blue based) much more ergonomic. Also, it's the only way I've found to let Zed run remotely inside a container without having the container run an SSH server.
> The actual development happens on a rented server
Why not Hyper-V or libvirt/KVM? VM escapes aren't a thing in real life (or VMs from hyperscalers wouldn't exist), so why deal with additional cost, latency, and third-party trust when you could just run it yourself?
Essentially using a repo that doesn’t matter with the coding agent and then creating a cross-repo PR to the real repo.
Almost back to point A then. If the server is compromised in some way, they can use (not copy) OP's keys to clone repos, inject code, etc.
Treat it as a colleague, making PRs that you review.
All you need is a separate limited user account on your computer. Multi-user Unix-y systems were designed for this kind of thing for decades.
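A minimal sketch of what that looks like when launching a tool, assuming a dedicated account (hypothetically named `agent` here) already exists and your own home directory is not readable by it. The helper only builds the command line; `sudo -u` and `-H` are standard sudo options.

```python
import shlex
from typing import List, Sequence


def as_limited_user(command: Sequence[str], user: str = "agent") -> List[str]:
    """Build an argv that runs `command` as `user` via sudo."""
    # -u selects the target account; -H points HOME at that account's home
    # directory, so the tool does not read or write your own dotfiles.
    # "--" stops sudo from interpreting the command's own flags.
    return ["sudo", "-u", user, "-H", "--", *command]


if __name__ == "__main__":
    # "some-agent-cli" is a placeholder for whatever tool you actually run.
    print(shlex.join(as_limited_user(["some-agent-cli", "--help"])))
```

Passing the resulting list to `subprocess.run` starts the tool under the other account; whether that is enough isolation depends on file permissions and on what the account is allowed to reach.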
My entire development environment is literally just "sudo".
I use my personal laptop for $WORK and everything work related is done via the VM.
Or just have a Linux VPS and SSH in for $5 a month.
I'm not seeing your point. Are you saying that I shouldn't use sudo because I might accidentally "run the wrong command"?
I know all the commands that I run and what they do.
> your kernel is not isolate
Am I more afraid of an npm package exploiting a zero-day kernel vulnerability on my Mac? Or just stealing my AWS keys and installing a crypto miner? Sudo suits my threat model just fine.
My M3 MacBook Pro is a million times more powerful as a dev machine than a cheap $5 VPS. Why would I waste my time with that?
https://www.reddit.com/r/ClaudeAI/comments/1n1moqy/how_did_c...
https://www.reddit.com/r/ClaudeCode/comments/1sa3fx8/claude_...
https://old.reddit.com/r/ClaudeAI/comments/1jfidvb/claude_tr...
claude-cli executes whatever is in the BROWSER env variable to open your browser at a current URL, so I pointed it at a simple shell script that writes the URL to a named pipe which is mounted into the container. The sandbox tool outside of the container is reading from that named pipe. When it receives a URL to open, it pops up a confirmation dialog with info about the URL. If you accept, it opens it in your host browser.
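Roughly, the two ends of that named pipe could look like the sketch below. The original uses a shell script; this is a Python equivalent for illustration. It assumes the CLI passes the URL as the first argument to the BROWSER program, and the pipe path and `URL_PIPE` variable are hypothetical. The confirmation dialog on the host side is left out.

```python
import os
import sys

# Hypothetical location: a FIFO created on the host with mkfifo and
# bind-mounted into the container at the same path.
PIPE_PATH = os.environ.get("URL_PIPE", "/run/sandbox/open-url.fifo")


def forward_url(url: str, pipe_path: str = PIPE_PATH) -> None:
    """Container side: write one URL per line to the pipe.

    Opening a FIFO for writing blocks until a reader has it open.
    """
    with open(pipe_path, "w", encoding="utf-8") as pipe:
        pipe.write(url.strip() + "\n")


def read_urls(pipe_path: str = PIPE_PATH):
    """Host side: yield URLs until the writer closes its end of the pipe."""
    with open(pipe_path, "r", encoding="utf-8") as pipe:
        for line in pipe:
            if line.strip():
                yield line.strip()


if __name__ == "__main__" and len(sys.argv) == 2:
    # Used as the BROWSER program inside the container.
    forward_url(sys.argv[1])
```

On the host you would call `read_urls` in a loop, reopening the pipe after each writer disconnects, and ask for confirmation before handing the URL to the real browser.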
The second step is, the callback URL after you sign in on the Claude website wants to connect back to a port on localhost to complete the sign-in. If the sandbox is being run with host networking mode, this just works fine, as claude cli has already opened that port so it's listening on the host network. However, if it is not running in host networking mode, the sandbox tool figures out what port it needs to listen on from looking at the URL, listens on it, and when it is hit, it just podman exec's curl inside the container to complete the callback.
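A sketch of the "figure out the port, then replay the request" part. It assumes the login URL carries the localhost callback in a `redirect_uri` query parameter, which is common for OAuth-style flows but not confirmed above; the function names and the container name are made up. The listener itself is omitted.

```python
from typing import List, Optional
from urllib.parse import parse_qs, urlparse


def callback_port(login_url: str, param: str = "redirect_uri") -> Optional[int]:
    """Return the port of the localhost callback embedded in the login URL, if any."""
    values = parse_qs(urlparse(login_url).query).get(param)
    if not values:
        return None
    # parse_qs has already percent-decoded the value.
    return urlparse(values[0]).port


def replay_in_container(container: str, callback_url: str) -> List[str]:
    """argv that repeats the browser's callback request from inside the container."""
    # -f: fail on HTTP errors, -sS: quiet but still print errors.
    return ["podman", "exec", container, "curl", "-fsS", callback_url]
```

The host-side tool would listen on `callback_port(url)`, rebuild the full callback URL from the incoming request path and query string, and run the argv from `replay_in_container` with `subprocess.run`.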
When they try to open a browser, they also print the URL to the console. Open that in your browser and go through an authentication flow; it'll end by forwarding you to a localhost URL like http://127.0.0.1:8080/authorization-code/callback?code=XXXX&... which will fail.
Copy that callback URL, connect to your VM/docker container, and curl it.
The curl stage requires the agent make a call to auth.whatever-vendor.com so if it fails at this stage, check your VM/container network settings. And make sure you quoted the curl right so the & wasn't misinterpreted.
It'll then save a file at ~/.codex/auth.json or ~/.claude.json or similar, so you won't need to log in again. The secret in this file will periodically rotate, so you need to mount it read-write, not read-only.
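On the quoting point: if you'd rather not eyeball it, Python's `shlex` can produce a correctly quoted command to paste. A small sketch; the `curl_command` helper and the optional `docker exec` wrapping are illustrative, and the container name is whatever yours is called.

```python
import shlex
from typing import Optional


def curl_command(callback_url: str, container: Optional[str] = None) -> str:
    """Return a shell-safe command line that fetches the callback URL."""
    argv = ["curl", "-fsS", callback_url]
    if container is not None:
        # Run curl inside the named container instead of on the host.
        argv = ["docker", "exec", container] + argv
    # shlex.join quotes any argument containing shell metacharacters such as
    # "&" or "?", so the URL reaches curl as a single argument.
    return shlex.join(argv)
```

For a URL containing `&`, the result wraps the whole URL in single quotes, which is what stops the shell from treating everything after the `&` as a separate background command.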
Gotta love how someone downvoted this.
- user and home directory for data
- crontab for scheduled jobs
- cgi for serving user space apps
- rsync for backups
We even rediscovered email patches, but with agent-to-agent help making and applying them.
It’s simpler for us to operate and for the agent to figure out.
# Create a new sandbox copying . as workdir (default container, but you can choose vm)
yoloai new mybugfix . --isolation vm
# attach to it (it has tmux already)
yoloai attach mybugfix
# Chat with the bot inside...
# Happy with its work? Diff it to be sure
yoloai diff mybugfix
# Happy with the changes? Apply them to your workdir
yoloai apply mybugfix
# All done? Destroy the sandbox
yoloai destroy mybugfix
The agent stays isolated at all times. No access to your secrets (except what you want), no access to your workdir until you apply. You can also easily restrict network access.
This does the same thing as in the blog post, except that there are a LOT of gotchas and minutiae and some yak shaving involved if you want to keep doing it manually.
I've gone through the whole path the author has, and finally had to admit that it's too much fiddling around to do it manually. Easier to just have a cmdline tool that does it for you. That's why I built it in the first place.
The above example doesn't specify workdir mounting mode, so it would be copy, not overlay.
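To make the copy mode concrete, here is a toy illustration of the copy / diff / apply idea using only the standard library. This is not how yoloai is implemented; the function names are invented, it assumes text files, and it ignores deletions and the actual isolation layer entirely.

```python
import difflib
import filecmp
import shutil
from pathlib import Path
from typing import List


def new_sandbox(workdir: Path, sandbox: Path) -> None:
    """Copy the working directory; the agent only ever touches the copy."""
    shutil.copytree(workdir, sandbox)


def changed_files(workdir: Path, sandbox: Path) -> List[Path]:
    """Relative paths of files that are new or modified in the sandbox."""
    changed = []
    for path in sorted(sandbox.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(sandbox)
        original = workdir / rel
        if not original.exists() or not filecmp.cmp(original, path, shallow=False):
            changed.append(rel)
    return changed


def sandbox_diff(workdir: Path, sandbox: Path) -> str:
    """Unified diff of everything the sandbox changed (text files assumed)."""
    chunks: List[str] = []
    for rel in changed_files(workdir, sandbox):
        old_path = workdir / rel
        old = old_path.read_text().splitlines(keepends=True) if old_path.exists() else []
        new = (sandbox / rel).read_text().splitlines(keepends=True)
        chunks.extend(difflib.unified_diff(old, new, f"a/{rel.as_posix()}", f"b/{rel.as_posix()}"))
    return "".join(chunks)


def apply_changes(workdir: Path, sandbox: Path) -> None:
    """Copy new and modified files back into the real working directory."""
    for rel in changed_files(workdir, sandbox):
        target = workdir / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(sandbox / rel, target)
```

The point of the shape is that nothing reaches the real workdir until `apply_changes` runs, so the diff step is a genuine review gate.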