# The case for sandboxing your dev tools
Package managers execute arbitrary code. AI coding agents execute arbitrary commands. Both get full user permissions by default. Here's why that has to change.
Two classes of tools routinely execute untrusted code with full user permissions on your laptop. One is old, one is new, and both are being actively exploited.
## Class one: package managers
Every time you run `npm install`, `pip install`, or `cargo install`, you’re authorising arbitrary code execution on your machine. Packages can run `postinstall` scripts, `setup.py` code, and build steps — all with the same permissions as you. They can read `~/.ssh/id_rsa`, send `~/.aws/credentials` to a remote server, enumerate your other repos, and install a launch agent that survives reboot.
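To make the threat concrete, here is a minimal sketch of what any install-time hook could do with default user permissions. It only *probes* for secrets rather than exfiltrating them, and the paths are illustrative; a real payload would do the same walk and then phone home.

```python
import os

# Illustrative targets -- the same paths named above.
SENSITIVE = [
    "~/.ssh/id_rsa",
    "~/.aws/credentials",
    "~/.gnupg",
]

def reachable_secrets(paths=SENSITIVE):
    """Return the sensitive paths this process can actually read."""
    found = []
    for p in paths:
        full = os.path.expanduser(p)
        if os.path.exists(full) and os.access(full, os.R_OK):
            found.append(p)
    return found

if __name__ == "__main__":
    # Unsandboxed, anything printed here is one open() away from
    # being sent to a remote server by a malicious install hook.
    print(reachable_secrets())
```

There is no privilege escalation involved: the script runs as you, so it reads what you can read.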
Recent supply-chain attacks show this isn’t theoretical:
- **Axios npm compromise (March 2026)** — backdoored versions of a package with 100M weekly downloads shipped a cross-platform RAT via a `postinstall` hook. Anyone whose CI or laptop ran `npm install` during the window received it.
- **Shai-Hulud worm (September 2025)** — a self-replicating worm compromised 500+ npm packages, harvested GitHub PATs and cloud credentials, and used them to push trojaned versions of other packages owned by its victims. Every infected developer became a new distribution vector.
The typical response is “audit your dependencies.” That doesn’t scale. A modern Node project has thousands of transitive dependencies; a single compromised maintainer anywhere in that tree is enough.
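A back-of-the-envelope sketch shows why auditing doesn’t scale: even a tiny per-package compromise probability compounds across a large transitive tree. The numbers below are illustrative assumptions, not measurements.

```python
def p_any_compromised(n_deps: int, p_per_dep: float) -> float:
    """P(at least one dependency compromised), assuming independence."""
    return 1 - (1 - p_per_dep) ** n_deps

if __name__ == "__main__":
    # Assume a 1-in-100,000 chance per package over some window.
    # Risk grows roughly linearly until the tree gets large.
    for n in (10, 100, 1500):
        print(n, round(p_any_compromised(n, 1e-5), 5))
```

With 1,500 transitive dependencies the aggregate risk is over a thousand times the single-package risk, and auditing every node in the tree yourself is not a realistic control.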
## Class two: AI coding agents
Claude Code, Cursor, Copilot Agents, and their peers read your codebase, edit files, and run shell commands on your behalf. When they work correctly they’re extraordinary productivity tools. When something goes wrong — a prompt injection buried in a README, a hallucinated `rm -rf`, a confused agent that decides to “clean up” a directory — the blast radius is everything you can touch: SSH keys, cloud credentials, keychain entries, other projects, unrestricted outbound network.
The usual mitigations are permission prompts (“allow this command?”) and YOLO-mode regret. Neither is structural. An attacker who can craft input reaching the agent can eventually craft input the agent will follow, and the user who clicks through prompts all day will click through the bad one too.
## What a real boundary looks like
The answer isn’t more prompts or better linters. It’s running these tools somewhere they can’t see anything they don’t need.
Silo does this by running each tool inside a lightweight Apple Container VM — a real Linux kernel, not a process namespace. The VM sees the current project directory and explicitly passed environment variables. That’s it.
| Attack surface | Without Silo | With Silo |
|---|---|---|
| `~/.ssh`, `~/.aws`, `~/.gnupg` | Full access | Don’t exist |
| All environment variables | Visible | Only explicitly passed |
| Persistent RAT in user space | Possible | Dies with ephemeral VM |
| Lateral movement to other projects | Possible | Only current project mounted |
| macOS keychain, browser data | Accessible | Blocked (separate kernel) |
| Unrestricted outbound network | Yes | Proxy allowlist per tool |
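The environment-variable row can be illustrated in miniature with a plain subprocess: instead of letting a child inherit the parent’s full environment, pass an explicit allowlist. This is a sketch of the principle applied at the VM boundary, not Silo’s actual implementation, and the `ALLOWED` set is an invented example.

```python
import os
import subprocess
import sys

ALLOWED = {"PATH", "HOME", "LANG"}  # illustrative allowlist

def run_allowlisted(cmd, parent_env):
    """Run cmd with only the allowlisted subset of parent_env."""
    env = {k: v for k, v in parent_env.items() if k in ALLOWED}
    return subprocess.run(cmd, env=env, capture_output=True, text=True)

if __name__ == "__main__":
    os.environ["AWS_SECRET_ACCESS_KEY"] = "dummy"  # would normally be inherited
    result = run_allowlisted(
        [sys.executable, "-c", "import os; print(sorted(os.environ))"],
        os.environ,
    )
    # The secret never crosses the boundary; the child sees only
    # the allowlisted names (plus anything the OS injects itself).
    print(result.stdout.strip())
```

Deny-by-default is the point: a variable you never named cannot leak, whereas a deny-list has to anticipate every secret in advance.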
This is the same kind of boundary you’d get running your agent in a cloud sandbox, but with none of the latency, none of the git-mirroring pain, and none of the billing surprises. It starts in ~600ms on your own hardware.
## What stops working
Almost nothing, in practice. You still run `python script.py`. You still run `npm install`. You still run `claude` in the project root. The difference is that the `postinstall` that just tried to `cat ~/.ssh/id_rsa` got back “no such file” — because there’s nothing there to steal.
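Inside the sandbox, the theft attempt sketched earlier fails in a specific way worth noting: the path isn’t hidden or permission-denied, it simply does not exist in the VM’s filesystem. A sketch of that failure mode:

```python
import os

def steal_key(path="~/.ssh/id_rsa"):
    """What a malicious hook attempts; returns None if there is nothing there."""
    try:
        with open(os.path.expanduser(path)) as f:
            return f.read()   # unsandboxed: key read, ready to exfiltrate
    except FileNotFoundError:
        return None           # sandboxed: the file was never mounted

# In an ephemeral VM with only the project directory mounted,
# steal_key() has nothing to find and nothing to return.
```

That distinction matters for tooling, too: “file not found” is an error every legitimate program already handles gracefully, so the sandbox rarely breaks well-behaved tools.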
If you’re writing software on macOS and your dev loop involves running third-party code or an AI agent, the question isn’t whether you need a sandbox. It’s which one, and whether the overhead is low enough that you’ll actually use it.