Deep dive · 2026 · 04 · 04 · 11 min read

Five sandboxes, one runtime.

A tour through local, Docker, SSH, Singularity, and Modal — and why we don't think one sandbox is enough for a real operator.

Most agent frameworks ship with one execution backend, usually a Docker container, and call it a day. We shipped five. Here's why.

Different workloads need different isolation. A shell command on a developer laptop wants the local filesystem. A long-running training job wants Modal's burst GPUs. A batch job on a research cluster wants Singularity. A trusted remote box wants SSH. A throwaway script wants Docker.

We unified them behind a single Sandbox trait in the Rust runtime. Each backend implements five methods: spawn, exec, copy_in, copy_out, and kill. The agent picks the backend per task via a small declarative policy, and that policy can be overridden per skill, per channel, or per user.
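To make the shape of that trait concrete, here is a minimal sketch. Only the trait name and the five method names come from the description above. The signatures, the String error type, the ExecOutput struct, the RecordingSandbox stand-in, the run_task driver, and the /work/in and /work/out guest paths are all assumptions made up for illustration, not the runtime's actual API.

```rust
use std::path::Path;

/// Result of one command. The fields are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct ExecOutput {
    pub exit_code: i32,
    pub stdout: String,
}

/// The five operations named in the post. Signatures are assumed.
pub trait Sandbox {
    fn spawn(&mut self) -> Result<(), String>;
    fn exec(&mut self, argv: &[&str]) -> Result<ExecOutput, String>;
    fn copy_in(&mut self, host: &Path, guest: &Path) -> Result<(), String>;
    fn copy_out(&mut self, guest: &Path, host: &Path) -> Result<(), String>;
    fn kill(&mut self) -> Result<(), String>;
}

/// Hypothetical stand-in backend: it runs nothing and only records calls.
#[derive(Default)]
pub struct RecordingSandbox {
    pub running: bool,
    pub calls: Vec<String>,
}

impl Sandbox for RecordingSandbox {
    fn spawn(&mut self) -> Result<(), String> {
        self.running = true;
        self.calls.push("spawn".to_string());
        Ok(())
    }

    fn exec(&mut self, argv: &[&str]) -> Result<ExecOutput, String> {
        // Refuse to run commands in a sandbox that was never started.
        if !self.running {
            return Err("exec before spawn".to_string());
        }
        self.calls.push(format!("exec {}", argv.join(" ")));
        Ok(ExecOutput { exit_code: 0, stdout: String::new() })
    }

    fn copy_in(&mut self, host: &Path, guest: &Path) -> Result<(), String> {
        self.calls
            .push(format!("copy_in {} -> {}", host.display(), guest.display()));
        Ok(())
    }

    fn copy_out(&mut self, guest: &Path, host: &Path) -> Result<(), String> {
        self.calls
            .push(format!("copy_out {} -> {}", guest.display(), host.display()));
        Ok(())
    }

    fn kill(&mut self) -> Result<(), String> {
        self.running = false;
        self.calls.push("kill".to_string());
        Ok(())
    }
}

/// Backend-agnostic driver: the caller never learns which sandbox it got.
/// A production driver would also call kill on the error paths.
pub fn run_task(
    sb: &mut dyn Sandbox,
    input: &Path,
    argv: &[&str],
    output: &Path,
) -> Result<ExecOutput, String> {
    sb.spawn()?;
    sb.copy_in(input, Path::new("/work/in"))?;
    let out = sb.exec(argv)?;
    sb.copy_out(Path::new("/work/out"), output)?;
    sb.kill()?;
    Ok(out)
}
```

The point of the sketch is the last function: run_task takes a `&mut dyn Sandbox`, so swapping Docker for Modal or SSH would change which value is passed in, not the code that drives it.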

The interesting bit is that the agent itself can author the policy. After a few weeks of use, it has usually figured out that the 'render this CSV' tasks belong in local, the 'fine-tune this adapter' tasks belong in Modal, and the 'ssh into prod and tail the log' tasks belong in SSH. We don't write that mapping. It writes it.
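A policy like that can be pictured as a handful of lookup tables. The sketch below is a guess at the shape, not the real format: the Backend, Policy, and Request types, the task names, and the Docker fallback are all hypothetical. The post says the policy can be overridden per skill, per channel, and per user but does not give a precedence order, so the order used here (user, then channel, then skill, then the task rule, then the fallback) is purely an assumption.

```rust
use std::collections::HashMap;

/// The five backends from the post.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Local,
    Docker,
    Ssh,
    Singularity,
    Modal,
}

/// Hypothetical policy shape: per-task rules plus three override layers.
#[derive(Debug)]
pub struct Policy {
    pub fallback: Backend,
    pub by_task: HashMap<String, Backend>,
    pub by_skill: HashMap<String, Backend>,
    pub by_channel: HashMap<String, Backend>,
    pub by_user: HashMap<String, Backend>,
}

/// What the runtime knows when it has to pick a backend (assumed fields).
pub struct Request<'a> {
    pub task: &'a str,
    pub skill: Option<&'a str>,
    pub channel: Option<&'a str>,
    pub user: Option<&'a str>,
}

/// Look up an optional key in one override table.
fn lookup(map: &HashMap<String, Backend>, key: Option<&str>) -> Option<Backend> {
    key.and_then(|k| map.get(k).copied())
}

impl Policy {
    pub fn new(fallback: Backend) -> Self {
        Policy {
            fallback,
            by_task: HashMap::new(),
            by_skill: HashMap::new(),
            by_channel: HashMap::new(),
            by_user: HashMap::new(),
        }
    }

    /// Assumed precedence: user > channel > skill > task rule > fallback.
    pub fn resolve(&self, req: &Request) -> Backend {
        lookup(&self.by_user, req.user)
            .or_else(|| lookup(&self.by_channel, req.channel))
            .or_else(|| lookup(&self.by_skill, req.skill))
            .or_else(|| self.by_task.get(req.task).copied())
            .unwrap_or(self.fallback)
    }
}

/// The mapping described above, written out by hand for illustration.
/// Task names are made up; Docker as the fallback is an assumption.
pub fn example_policy() -> Policy {
    let mut p = Policy::new(Backend::Docker);
    p.by_task.insert("render_csv".to_string(), Backend::Local);
    p.by_task.insert("fine_tune_adapter".to_string(), Backend::Modal);
    p.by_task.insert("tail_prod_log".to_string(), Backend::Ssh);
    p
}
```

Because the policy is plain data, the agent can edit it the same way it edits any other file: adding a rule is one insert into by_task, and an operator who disagrees can pin a user or channel to a different backend without touching the learned rules.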

Five sandboxes feels like a lot until you watch a real operator at work for a day. Then it feels like the minimum.