Agent Sandboxes Are the Containment Boundary
Approval prompts slow risky actions down, but they do not contain them. Only OS-level sandboxes limit the blast radius when agents act.
Image source: Virtuo Doc via Wikimedia Commons
License: CC BY-SA 4.0
Executive summary
Approval prompts are a human policy layer. Sandboxes are a technical enforcement layer. The safest agent workflows use both: approvals for intent, sandboxes for containment. Codex CLI and Claude Code both moved in this direction, combining OS-level isolation with configurable approval policies.
How Codex and Claude Split Approval from Containment
The current generation of coding agents has moved beyond permission prompts alone. OpenAI documents Codex as a combination of approval policies and OS-level sandboxing, while Anthropic positions Claude Code sandboxing as the way to reduce approval fatigue without giving up containment. The evidence points in the same direction: user confirmation is policy, not enforcement.
Why Confirmation Is Not a Reliable Security Boundary
Enterprises should read this as a control-placement issue. Approval prompts express intent, but they do not reliably contain damage when users get tired, normalize exceptions, or simply do not understand the blast radius of a command. Sandboxing matters because it enforces a technical boundary even when the operator is rushed or wrong.
How Approval Policy and Sandbox Reach Divide the Risk
That's why vendors now split intent from containment. OpenAI documents Codex security as two layers that work together: the sandbox determines what is technically possible, and the approval policy determines when the agent must ask. In the Codex CLI, the default is an OS-enforced sandbox with network access disabled and write access limited to the workspace, while approval modes decide when the agent pauses for confirmation.
Codex CLI Has Two Dials
- Approval modes: Auto (default), Read-only, Full Access (how often Codex asks).
- Sandbox modes: read-only, workspace-write, danger-full-access (what Codex can do).
In Auto, Codex can read, edit, and run commands inside the working directory, then asks
before touching network or outside paths. Use /status to verify writable roots
and /approvals to adjust the policy mid-session.
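The split between the two dials can be sketched as a small decision function. This is a simplified model for illustration, not Codex's implementation: the names `Action`, `Decision`, `decide`, and the `ask_to_escalate` flag are hypothetical, and the only facts taken from the text are the three sandbox mode names and the rule that the sandbox decides what is possible while the approval policy decides when to ask.

```python
import posixpath
from dataclasses import dataclass
from enum import Enum


class Sandbox(Enum):
    READ_ONLY = "read-only"
    WORKSPACE_WRITE = "workspace-write"
    DANGER_FULL_ACCESS = "danger-full-access"


class Decision(Enum):
    ALLOW = "allow"  # inside the sandbox, runs without a prompt
    ASK = "ask"      # outside the sandbox, escalate to the operator
    DENY = "deny"    # outside the sandbox, no escalation path


@dataclass(frozen=True)
class Action:
    kind: str       # "read", "write", or "network" (hypothetical categories)
    path: str = ""  # only meaningful for "write" in this sketch


def inside(path: str, root: str) -> bool:
    """True if `path` resolves to `root` or something beneath it ('..' is collapsed)."""
    path = posixpath.normpath(path)
    root = posixpath.normpath(root)
    return path == root or path.startswith(root.rstrip("/") + "/")


def sandbox_permits(action: Action, sandbox: Sandbox, workspace: str) -> bool:
    """Dial 1: what is technically possible."""
    if sandbox is Sandbox.DANGER_FULL_ACCESS:
        return True
    if action.kind == "read":
        return True
    if action.kind == "write":
        return sandbox is Sandbox.WORKSPACE_WRITE and inside(action.path, workspace)
    return False  # network is off in both restricted modes of this model


def decide(action: Action, sandbox: Sandbox, workspace: str,
           ask_to_escalate: bool) -> Decision:
    """Dial 2: whether to ask when the sandbox would block the action."""
    if sandbox_permits(action, sandbox, workspace):
        return Decision.ALLOW
    return Decision.ASK if ask_to_escalate else Decision.DENY
```

In this model an edit under the workspace is allowed silently, while a write to `/etc/hosts` or any network call either prompts or fails, depending on the approval dial. The point of keeping the two functions separate is that loosening the approval dial never widens what `sandbox_permits` returns.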
Sandbox Mechanics: OS Enforcement, Not a UI Prompt
In Codex CLI, the sandbox is not just a UI prompt. It is enforced by the OS. OpenAI documents
macOS enforcement via Seatbelt using sandbox-exec, and Linux enforcement via
bwrap plus seccomp by default. The platform helpers include aliases such
as codex sandbox landlock, but Codex documents that as helper terminology rather
than the default Linux enforcement path. The same policy applies to every command the agent
spawns, not just the primary process. On Windows, Codex uses the Linux sandbox when running
inside WSL and offers an experimental native sandbox for non-WSL setups.
bwrap vs seccomp (Linux)
For Codex's documented Linux path, bwrap provides the process and filesystem
isolation boundary while seccomp filters the syscall surface. That pairing is the
default behavior called out in the OpenAI docs. Landlock is still useful related context for
Linux sandboxing, and the helper alias exists, but it should not be read here as Codex's
default enforcement mechanism.
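To make the bwrap half of that pairing concrete, here is a minimal sketch that builds a bubblewrap command line with a read-only root, one writable workspace, and no network namespace access. It is an assumption-laden illustration, not the invocation Codex actually uses: the function name `bwrap_argv` and the choice of mounts are hypothetical, and the seccomp syscall filter is left out entirely.

```python
def bwrap_argv(workspace: str, command: list[str],
               allow_network: bool = False) -> list[str]:
    """Build a bwrap command line: read-only root, writable workspace.

    Illustrative only. A production sandbox would also attach a seccomp
    filter and tune the mounts for the host distribution.
    """
    argv = [
        "bwrap",
        "--ro-bind", "/", "/",            # whole filesystem visible, read-only
        "--dev", "/dev",                  # minimal device nodes
        "--proc", "/proc",                # fresh procfs for the new PID namespace
        "--tmpfs", "/tmp",                # private scratch space
        "--bind", workspace, workspace,   # the only writable host path
        "--chdir", workspace,
        "--unshare-pid",                  # hide host processes
        "--die-with-parent",              # do not outlive the agent
    ]
    if not allow_network:
        argv.append("--unshare-net")      # new, empty network namespace
    return argv + ["--", *command]
```

Passing the result to `subprocess.run` on a Linux host with bubblewrap installed would run the command under those mounts. Because child processes inherit the namespaces, everything the command spawns sees the same read-only root.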
Seatbelt (macOS)
Codex relies on macOS Seatbelt via sandbox-exec for OS-level enforcement. This
is the same class of primitives used by other agent tools that need containment without a
full container.
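For a sense of what a Seatbelt policy looks like, the sketch below generates a small deny-by-default profile and the matching `sandbox-exec -p` command line. The profile is an illustrative assumption, not the one Codex ships: the helper names are hypothetical, and a profile this small would likely need further allowances (for example system services) before real developer tools run under it.

```python
def sbpl_quote(path: str) -> str:
    """Quote a path as a string literal for the profile."""
    return '"' + path.replace("\\", "\\\\").replace('"', '\\"') + '"'


def seatbelt_profile(writable_roots: list[str]) -> str:
    """Deny everything by default, then allow reads and writes under given roots."""
    lines = [
        "(version 1)",
        "(deny default)",        # anything not listed below is refused
        "(allow process-exec)",
        "(allow process-fork)",
        "(allow file-read*)",
    ]
    for root in writable_roots:
        lines.append(f"(allow file-write* (subpath {sbpl_quote(root)}))")
    # No rule grants network access, so the default deny covers it.
    return "\n".join(lines)


def sandbox_exec_argv(profile: str, command: list[str]) -> list[str]:
    """Command line that runs `command` under the inline profile."""
    return ["sandbox-exec", "-p", profile, *command]
```

The design mirrors the Linux sketch in spirit: start from denial, then grant the workspace explicitly, so a forgotten rule fails closed rather than open.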
Where Sandbox Defaults Still Fight the Operator
Anthropic reports a similar conclusion: permission prompts alone do not scale. Claude Code now uses sandboxing to enforce filesystem and network isolation, with internal usage showing an 84% reduction in approval prompts. Their write-up describes a sandbox runtime built on OS-level primitives like Linux bubblewrap and macOS Seatbelt, and the Claude docs position sandboxing as the default way to reduce approval fatigue while keeping tool access bounded. Claude Code also gates network access through a proxy with domain allowlists, and sandboxing is turned on with the /sandbox command.
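The domain-allowlist idea can be illustrated with a small host check of the kind a filtering proxy might apply before forwarding a request. The entry format here (exact hostnames plus a leading `*.` wildcard for subdomains) and the function name `host_allowed` are assumptions made for the example; the source does not describe Claude Code's actual matching rules.

```python
def host_allowed(host: str, allowlist: list[str]) -> bool:
    """Return True if `host` matches an allowlist entry.

    Assumed entry formats (hypothetical, for illustration):
      "example.com"    matches only that exact host
      "*.example.com"  matches any subdomain, but not example.com itself
    """
    host = host.lower().rstrip(".")
    for entry in allowlist:
        entry = entry.lower()
        if entry.startswith("*."):
            # Compare against ".example.com" so "evilexample.com" cannot match.
            if host.endswith(entry[1:]):
                return True
        elif host == entry:
            return True
    return False
```

Anything not matched is refused, so the list fails closed: a new dependency host has to be added deliberately rather than approved in the moment.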
Where Sandboxing Still Breaks Down in Practice
Even with better defaults, teams still hit sharp edges. The community experience is mixed. Some Cursor users ask for allowlists to auto-run safe
commands. Some Claude Code users report confusion when sandbox settings appear to be bypassed
by fallback flows. OpenAI also notes that Codex sandboxing can fail inside containers that
block the namespace, setuid bwrap, or seccomp operations it needs. In that case you should
rely on the container boundary and run Codex with a full-access sandbox mode inside that
container. The consistent theme is that sandboxes are necessary but not effortless.
Operational Playbook
- Make sandboxing the default: containers or OS sandboxes for all agent runs.
- Keep network off unless needed: use explicit allowlists when supported.
- Expose context clearly: surface active sandbox + approvals via /status and /approvals.
- Test the sandbox: use Codex sandbox or debug commands to validate what is blocked.
- Expect edge cases: missing kernel features, CLI fallbacks, and allowlist gaps happen.
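One way to test a sandbox independently of any vendor tooling is to run a small probe from inside it and check that the expected operations fail. The script below is a sketch under that assumption: the function names, the probe file naming, and the default `example.com:443` target are all hypothetical choices, and it checks only file creation and outbound TCP.

```python
import os
import socket
import uuid


def can_write(directory: str) -> bool:
    """Try to create and remove a throwaway file in `directory`."""
    probe = os.path.join(directory, f".sandbox-probe-{uuid.uuid4().hex}")
    try:
        with open(probe, "w") as fh:
            fh.write("probe")
    except OSError:
        return False  # blocked, missing, or read-only
    os.remove(probe)
    return True


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Try to open an outbound TCP connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False  # covers DNS failure, refusal, and timeout


def check_containment(workspace: str, outside_dir: str,
                      host: str = "example.com", port: int = 443) -> dict:
    """Summarize whether the expected boundary holds from inside the sandbox."""
    return {
        "workspace_writable": can_write(workspace),
        "outside_blocked": not can_write(outside_dir),
        "network_blocked": not can_connect(host, port),
    }
```

Run inside a workspace-write style sandbox with networking off, all three values should come back true. Run outside any sandbox, the last two will typically be false, which is a quick way to notice that a session silently fell back to an unsandboxed mode.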
Enterprise Operating Controls: 3LS as the Policy Layer Above Local Containment
3LS complements sandboxing by making the enterprise policy layer explicit. The sandbox limits what the agent can technically do on the host, while 3LS governs when sensitive actions, prompts, or tool requests should be blocked, reviewed, or logged as policy events across environments. That matters because organizations need more than local containment. They need consistent operator visibility into how those boundaries are being used, relaxed, or bypassed.
How to Roll Out Sandboxing as a Controlled Default
Make sandboxing the default for agent workflows, document the approval modes teams are allowed to use, and test the containment boundary under real workloads rather than assuming the vendor default is enough. Then put reviewable policy around network access, dangerous commands, and any local override path so “safe enough” does not devolve into personal preference.