Why your AI agents need a sandbox (and what Cloudflare just shipped)

Maroš Bednár

March 25, 2026

Containers are too slow for isolating AI-generated code. Not a little slow, orders of magnitude slow. Seconds to start, hundreds of megabytes per instance, and when you need a thousand concurrent sandboxes, you're looking at an infrastructure project that eats a quarter.

That's the short version of what this article is about, and if you've already reached the same conclusion, you can skip to "What Cloudflare shipped" below.

The container problem

The standard approach to running untrusted code: spin up a Docker container with an isolated filesystem and network. It works. For an internal tool where the agent handles 50 requests per day, it's fine. For a customer-facing product with thousands of concurrent sessions generating and executing code, it falls apart fast. Start time alone kills the user experience, and orchestrating thousands of containers through Kubernetes is a project in itself.

We ran into this on a client project in Q1, building an agent that generates TypeScript from natural language input and executes it immediately. The model was the easy part. The question was: where does the generated code actually run? And "on the same server as everything else" is not an answer; it's a security incident waiting for a reason.

What Cloudflare shipped

On March 24, Cloudflare launched the Dynamic Worker Loader in open beta. I think it's the most interesting infrastructure release this quarter. The concept: instead of containers, use V8 isolates. V8 is the JavaScript engine behind Chrome and Node.js. An isolate is a separate execution context within the same process, with its own heap, its own stack, and zero access to the host. It starts in milliseconds.

The numbers Cloudflare published: 100x faster startup than containers, 10-100x less memory, no cap on concurrent sandboxes (they claim millions of requests per second), and no added network latency, because the isolate runs on the same machine as the parent Worker.

I take Cloudflare's benchmarks with a grain of salt because they're selling Cloudflare. But even at half those numbers, it's a step change from containers.

In practice: your main Worker receives a request and an AI model generates a TypeScript function. The Worker calls the Dynamic Worker Loader, which creates a V8 isolate, loads the code, executes it, and returns the result. After the response, the isolate is discarded. No state, no residue, no access to the parent.
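
The flow above can be sketched roughly as follows. This is not Cloudflare's API: `SandboxLoader`, `Sandbox`, `CodeGenerator`, and `handleAgentRequest` are hypothetical names standing in for the Dynamic Worker Loader binding and the model call, so check Cloudflare's documentation for the real binding shape. The sketch only shows the order of operations and the fact that a failure inside the sandbox becomes a result rather than a crash in the parent.

```typescript
// Hypothetical interfaces: illustrative names, not Cloudflare's API.
interface Sandbox {
  // Runs the loaded code's entry point with a serializable input.
  run(input: unknown): Promise<unknown>;
}

interface SandboxLoader {
  // Creates a fresh sandbox for the given source code.
  load(source: string): Promise<Sandbox>;
}

// Stand-in for the model call that turns a prompt into source code.
type CodeGenerator = (prompt: string) => Promise<string>;

interface AgentResult {
  ok: boolean;
  output?: unknown;
  error?: string;
}

// One request: generate code, load it into a fresh sandbox, run it, return.
// The sandbox is never stored, so nothing carries over to the next request.
async function handleAgentRequest(
  prompt: string,
  input: unknown,
  generateCode: CodeGenerator,
  loader: SandboxLoader,
): Promise<AgentResult> {
  const source = await generateCode(prompt);
  try {
    const sandbox = await loader.load(source);
    const output = await sandbox.run(input);
    return { ok: true, output };
  } catch (err) {
    // Generated code is untrusted: report the failure instead of rethrowing.
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}
```

Passing the loader and generator in as parameters keeps the handler testable with fakes, and makes it obvious that the generated code receives only `input`, nothing else from the parent.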

Code Mode

There's a pattern Cloudflare calls Code Mode that I find genuinely interesting. Instead of the standard tool-calling protocol (model generates JSON, runtime executes, model analyzes result, calls next tool), the agent writes a single TypeScript function that chains the entire workflow.

Cloudflare's measurement: 81% reduction in token usage versus tool calls. I haven't verified this independently, but directionally it makes sense. Fewer roundtrips, less context window spent on intermediate results.
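
A back-of-envelope model shows why the direction makes sense. It is my own simplification, not Cloudflare's methodology, and it makes no attempt to reproduce the 81% figure: it assumes every tool-call round trip re-sends the base context plus all earlier calls and results, while the single-function approach reads the base context once and keeps intermediate results inside the sandbox. The `WorkflowShape` fields are hypothetical parameters.

```typescript
// Hypothetical parameters describing one workflow; all values are assumptions.
interface WorkflowShape {
  steps: number;        // number of tool calls in the workflow
  baseTokens: number;   // system prompt + tool definitions + user request
  callTokens: number;   // tokens the model emits per JSON tool call
  resultTokens: number; // tokens per intermediate result fed back to the model
  codeTokens: number;   // tokens for one generated function covering all steps
}

// Tool calling: each round trip reads the base context plus every earlier
// call and result, then emits one more call.
function toolCallingTokens(w: WorkflowShape): number {
  let total = 0;
  for (let i = 0; i < w.steps; i++) {
    const contextRead = w.baseTokens + i * (w.callTokens + w.resultTokens);
    total += contextRead + w.callTokens;
  }
  return total;
}

// Single function: read the base context once, emit one function.
// Intermediate results never pass through the model.
function codeModeTokens(w: WorkflowShape): number {
  return w.baseTokens + w.codeTokens;
}
```

With made-up inputs of 4 steps, 1,000 base tokens, 50 tokens per call, 300 per result, and a 400-token function, the model gives 6,300 tokens for tool calling and 1,400 for the single function. The specific numbers mean nothing; the point is that the tool-calling total grows with the square of the step count while the other does not.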

A practical example: the agent gets "create an invoice in Stripe for customer X with items Y." Instead of four separate tool calls, it writes one function that does the entire flow in a single pass.
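
As a rough illustration, the function the agent writes for that request might look like the sketch below. `BillingTools` and its four methods are a hypothetical tool surface invented for this example; they are not Stripe SDK calls, and the real shape depends on what you choose to expose to the sandbox.

```typescript
// Hypothetical tool surface exposed to the sandbox. Not the Stripe SDK.
interface LineItem {
  description: string;
  amountCents: number;
}

interface BillingTools {
  findCustomer(name: string): Promise<{ id: string }>;
  createInvoice(customerId: string): Promise<{ id: string }>;
  addLineItem(invoiceId: string, item: LineItem): Promise<void>;
  finalizeInvoice(invoiceId: string): Promise<{ id: string; totalCents: number }>;
}

// The kind of function an agent might generate: the whole flow in one pass,
// with intermediate values held in local variables instead of model context.
async function invoiceCustomer(
  tools: BillingTools,
  customerName: string,
  items: LineItem[],
): Promise<{ id: string; totalCents: number }> {
  const customer = await tools.findCustomer(customerName);
  const invoice = await tools.createInvoice(customer.id);
  for (const item of items) {
    await tools.addLineItem(invoice.id, item);
  }
  return tools.finalizeInvoice(invoice.id);
}
```

The model sees only the final return value. The customer ID and invoice ID never travel back through the context window.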

The constraints

V8 only. So TypeScript and JavaScript, not Python, not Go, not native binaries. It's not designed for long-running tasks either; an isolate processes one request and gets discarded. Pricing sits at $0.002 per unique Worker per day plus standard CPU and invocation charges (waived during beta).
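
The per-Worker charge is easy to estimate. The sketch below uses only the $0.002 figure quoted above and ignores CPU and invocation charges; `dailyLoaderChargeUsd` is a hypothetical helper, and how Cloudflare counts a "unique" Worker in practice is something to confirm against their pricing page.

```typescript
// $0.002 per unique Worker per day, held in micro-dollars so the
// multiplication stays in exact integers.
const MICRO_USD_PER_UNIQUE_WORKER_PER_DAY = 2_000;

// Estimates only the per-Worker line item, not CPU or invocation charges.
function dailyLoaderChargeUsd(uniqueWorkersPerDay: number): number {
  if (!Number.isInteger(uniqueWorkersPerDay) || uniqueWorkersPerDay < 0) {
    throw new RangeError("uniqueWorkersPerDay must be a non-negative integer");
  }
  return (uniqueWorkersPerDay * MICRO_USD_PER_UNIQUE_WORKER_PER_DAY) / 1_000_000;
}
```

If every generated snippet counts as its own Worker, 10,000 snippets in a day would come to $20 for this line item once the beta ends, so it's worth knowing how much of your generated code is actually unique.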

My take

If you're building agents that generate and execute code, isolation isn't optional. Dynamic Workers are one good answer: fast, cheap, scales well. But not the only answer. Sometimes you still need a container. Sometimes an API gateway restriction is enough. It depends on what the agent does and what data it touches.

The harder question, which Cloudflare doesn't answer (and shouldn't), is how you design the boundary between trusted and untrusted code in an agent system. The sandbox is just the execution layer. What goes inside it, what it's allowed to call, and how you validate its output: that's the architecture problem. If you're working through it, we've written about agent vs. automation decisions and multi-model strategies that touch on this.
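
Two pieces of that boundary can be sketched in a few lines: an explicit allowlist for what the sandboxed code may call, and validation of whatever comes back. Every name here (`Capability`, `restrictCapabilities`, `InvoiceSummary`, `parseInvoiceSummary`) is hypothetical and only illustrates the idea; it is one possible shape, not a recommendation for a specific product.

```typescript
// Hypothetical names throughout; this illustrates the boundary only.
type Capability = (args: unknown) => unknown;

// What the sandbox may call: only capabilities named in the allowlist.
function restrictCapabilities(
  all: Record<string, Capability>,
  allowed: readonly string[],
): Record<string, Capability> {
  const granted: Record<string, Capability> = {};
  for (const name of allowed) {
    if (Object.prototype.hasOwnProperty.call(all, name)) {
      const capability = all[name];
      if (capability !== undefined) {
        granted[name] = capability;
      }
    }
  }
  return granted;
}

// How output is validated: treat it as untrusted input to the trusted side.
interface InvoiceSummary {
  invoiceId: string;
  totalCents: number;
}

function parseInvoiceSummary(output: unknown): InvoiceSummary {
  if (typeof output !== "object" || output === null) {
    throw new TypeError("sandbox output must be an object");
  }
  const record = output as Record<string, unknown>;
  const invoiceId = record["invoiceId"];
  const totalCents = record["totalCents"];
  if (typeof invoiceId !== "string" || invoiceId.length === 0) {
    throw new TypeError("invoiceId must be a non-empty string");
  }
  if (typeof totalCents !== "number" || !Number.isInteger(totalCents) || totalCents < 0) {
    throw new TypeError("totalCents must be a non-negative integer");
  }
  // Copy only the validated fields; anything extra is dropped.
  return { invoiceId, totalCents };
}
```

The sandbox decides nothing about its own permissions, and the trusted side never uses sandbox output it hasn't checked.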

If you're building an agent that runs user-generated code and want to talk architecture, get in touch.
