Microsoft previewed ACI AI Sandboxes at Build, an execution model built for autonomous AI agents that can spin up 10,000 isolated environments in roughly two seconds. The move stakes Azure's claim on a new category of agentic compute, and it reshapes how teams should think about where their agents run.
Microsoft used its Build stage to preview something that sits squarely at the intersection of two trends most cloud teams are already tracking: the rise of autonomous AI agents and the long-running effort to make serverless containers start faster and isolate harder. The preview is called ACI AI Sandboxes, an extension of Azure Container Instances aimed at giving AI agents a secure, disposable place to execute their own generated code.
For anyone advising on cloud strategy, this is less a product launch and more a signal about where the major providers think compute is heading. It deserves a careful read rather than a headline skim.

What Changed
Microsoft's core argument is that traditional containers were never built to be a security boundary for code written by an AI model at runtime. Process-isolated containers share a kernel, which is acceptable when you control and trust the workload. It becomes a real problem when the workload is code an agent generated thirty seconds ago, possibly in response to an untrusted prompt.
ACI already addressed part of this by using Hyper-V isolation, giving each container instance VM-grade separation rather than relying on kernel namespaces alone. The new piece is Direct Virtualization, which Microsoft refers to as L1VH. The idea is to let isolated workloads run much closer to the hardware while keeping strong isolation guarantees, cutting the overhead that normally comes with a virtualization layer.
Layered on top of that, an AI Sandbox becomes a defined runtime where an agent can execute generated code, run tools and plugins, reach enterprise resources, perform reasoning steps, and do all of this independently from every other agent on the platform. Each sandbox lives in its own isolated environment, which is what makes secure multi-tenant execution at cloud scale plausible.
The headline demo, shown during Mark Russinovich's session on serverless container advances, was scale. Using memory snapshotting, ACI reuses pre-initialized memory images instead of running a full boot sequence for every new environment. The claimed result is 10,000 sandboxes launched in roughly two seconds. For agentic systems that may need thousands of short-lived execution environments at once, startup latency is the difference between a usable architecture and a theoretical one.

Provider Comparison
The sandbox-for-agents concept is not unique to Azure, and that context matters when you are weighing a strategy.
AWS has been routing similar needs through Firecracker, the microVM technology that underpins Lambda and Fargate. Firecracker was purpose-built for fast-starting, strongly isolated microVMs, and it is open source, which has made it the foundation for a number of third-party code-execution sandboxes. AWS has not, as of this writing, packaged a first-party "AI agent sandbox" with the same framing Microsoft is using, but the primitives are mature and battle-tested in production at enormous scale.
Google Cloud approaches the same territory through Cloud Run and gVisor, its user-space kernel that intercepts syscalls to isolate untrusted workloads. gVisor trades a small performance cost for a strong security boundary without full hardware virtualization, a different bet than Microsoft's hardware-adjacent Direct Virtualization approach.
Specialist vendors are also worth naming because they often lead the incumbents here. E2B and Modal have built businesses specifically around fast, isolated sandboxes for AI-generated code, and they typically offer sub-second cold starts today rather than as a preview.
What sets Microsoft's pitch apart is the convergence story. Azure is describing a single compute control plane that hosts virtual machines, ACI, AI Sandbox Groups, Direct Virtualization, L2 utility VMs, and the underlying host infrastructure together. The claim is that an improvement in one layer benefits every workload type above it. AWS and Google both have unified control planes internally, but neither has marketed agent sandboxes as a converged compute layer in quite this way.

Pricing and Migration Considerations
Pricing for AI Sandboxes is not public, which is expected for a preview gated behind an early-access form. That absence is itself a planning factor. ACI today bills per second for vCPU and memory, and if sandboxes inherit that model, the cost question becomes one of volume and duration rather than unit price. Ten thousand environments that each live for two seconds are cheap; ten thousand environments that each run for ten minutes while an agent reasons produce a very different invoice. Teams evaluating agentic architectures should model the duration distribution of their sandboxes, not just the count.
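To make the duration point concrete, here is a minimal back-of-envelope sketch. It assumes sandboxes inherit per-second vCPU and memory billing, which Microsoft has not confirmed. The two rates are placeholders chosen for illustration, not published Azure prices, and `SandboxBucket`, `bucket_cost`, and `fleet_cost` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable

# Placeholder rates for illustration only; NOT published Azure prices.
VCPU_USD_PER_SECOND = 0.00001
MEMORY_GB_USD_PER_SECOND = 0.000001


@dataclass(frozen=True)
class SandboxBucket:
    """A group of sandboxes that share the same lifetime and size."""
    count: int
    lifetime_seconds: float
    vcpus: float = 1.0
    memory_gb: float = 1.0


def bucket_cost(bucket: SandboxBucket) -> float:
    """Cost of one bucket under assumed per-second vCPU + memory billing."""
    per_second = (bucket.vcpus * VCPU_USD_PER_SECOND
                  + bucket.memory_gb * MEMORY_GB_USD_PER_SECOND)
    return bucket.count * bucket.lifetime_seconds * per_second


def fleet_cost(buckets: Iterable[SandboxBucket]) -> float:
    """Total cost across a duration distribution expressed as buckets."""
    return sum(bucket_cost(b) for b in buckets)


# The two scenarios from the text: same count, very different lifetimes.
short_lived = SandboxBucket(count=10_000, lifetime_seconds=2)
long_lived = SandboxBucket(count=10_000, lifetime_seconds=600)

print(f"short-lived: ${bucket_cost(short_lived):.2f}")
print(f"long-lived:  ${bucket_cost(long_lived):.2f}")
```

Because cost is linear in lifetime under this assumed model, the ten-minute bucket costs 300 times the two-second bucket whatever the real rates turn out to be. A realistic estimate would pass `fleet_cost` several buckets that approximate the observed lifetime distribution.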
The migration angle is more strategic than technical. Agent execution sandboxes are a relatively new layer, so most organizations are not migrating off an existing equivalent. The lock-in risk shows up in how tightly your agent runtime couples to a provider-specific sandbox API. If your orchestration layer assumes Azure's sandbox lifecycle and resource model, moving that workload to Firecracker-backed infrastructure later means reworking the execution boundary, not just changing an endpoint. The defensive posture is to keep the agent logic and the sandbox provisioning behind your own abstraction, the same discipline that made Kubernetes portability achievable.
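One way to picture that abstraction is a narrow interface that agent logic depends on, with each provider hidden behind an adapter. The sketch below is illustrative only: `SandboxProvider`, `ExecutionResult`, `LocalSubprocessProvider`, and `execute_agent_step` are hypothetical names, and no real Azure, AWS, or Google sandbox API is shown, since the AI Sandbox API surface is not public. The local adapter runs code in a plain subprocess so the example is runnable; it is a test stand-in, not a security boundary.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ExecutionResult:
    exit_code: int
    stdout: str
    stderr: str


class SandboxProvider(Protocol):
    """The only surface agent logic is allowed to depend on."""

    def run(self, code: str, timeout_s: float) -> ExecutionResult:
        ...


class LocalSubprocessProvider:
    """Test stand-in: runs code in a child Python process.

    This offers NO isolation from the host. A production adapter would
    wrap a provider's sandbox service behind the same `run` method.
    """

    def run(self, code: str, timeout_s: float) -> ExecutionResult:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
        return ExecutionResult(proc.returncode, proc.stdout, proc.stderr)


def execute_agent_step(provider: SandboxProvider, code: str,
                       timeout_s: float = 10.0) -> ExecutionResult:
    """Agent-side call site: knows the interface, not the vendor."""
    return provider.run(code, timeout_s)


result = execute_agent_step(LocalSubprocessProvider(), "print(1 + 1)")
print(result.exit_code, result.stdout.strip())
```

Swapping providers then means writing one new adapter class rather than touching every call site in the orchestration layer. Lifecycle details such as snapshots, pooling, or GPU attachment would stay inside each adapter.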
Direct access to hardware, including GPUs, is the other piece worth tracking. Microsoft describes AI workloads as a mix of CPU-based orchestration, GPU-accelerated inference, embedding models, tool execution, and agent runtimes coexisting on shared infrastructure. If Direct Virtualization delivers GPU passthrough into strongly isolated sandboxes at the promised performance, that combination is harder to replicate with user-space isolation approaches, and it could become a genuine differentiator for inference-heavy agent fleets.

Business Impact
For decision-makers, the practical takeaway is that the unit of agentic compute is starting to standardize, and the providers are positioning early. If your roadmap includes agents that write and run code, browse, or call tools on behalf of users, the security boundary around that execution is no longer an afterthought you can solve with a container and good intentions.
The scale numbers also reframe cost and capacity planning. An architecture that can launch and discard thousands of environments in seconds invites a usage pattern where every agent task gets a fresh, clean sandbox. That is excellent for security and reproducibility, but it also makes consumption-based spend harder to predict. Finance and platform teams should align on guardrails before agent workloads scale, not after.
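A guardrail of that kind can be as simple as a worst-case budget check in front of every launch request. The following is a minimal sketch under stated assumptions: `SpendGuard` is a hypothetical class, the blended per-second rate is a placeholder, and the check assumes every sandbox runs until a hard lifetime cap, which deliberately overestimates spend.

```python
from dataclasses import dataclass


@dataclass
class SpendGuard:
    """Refuses launches whose worst-case cost would exceed the budget."""
    budget_usd: float
    rate_per_second_usd: float      # placeholder blended rate, not a real price
    max_seconds_per_sandbox: float  # hard lifetime cap enforced elsewhere
    committed_usd: float = 0.0

    def try_reserve(self, sandboxes: int) -> bool:
        # Worst case: every requested sandbox runs until the lifetime cap.
        worst_case = (sandboxes
                      * self.max_seconds_per_sandbox
                      * self.rate_per_second_usd)
        if self.committed_usd + worst_case > self.budget_usd:
            return False
        self.committed_usd += worst_case
        return True


guard = SpendGuard(budget_usd=10.0,
                   rate_per_second_usd=0.001,
                   max_seconds_per_sandbox=100.0)

print(guard.try_reserve(50))  # fits within the budget
print(guard.try_reserve(60))  # would exceed the budget, so it is refused
```

Reserving against the worst case is conservative by design. A fuller version would release unused reservation when a sandbox exits early, and would reset the budget per team or per billing window.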
My advice to clients is to treat ACI AI Sandboxes as a strong signal rather than a commitment. Prototype against it if you are already an Azure shop and want to understand the model, but keep your agent execution layer abstracted so you retain leverage across AWS, Google, and the specialist sandbox vendors. The category is real and the major clouds are clearly investing, which historically means rapid feature movement and shifting price floors over the next several quarters. Position to take advantage of that competition rather than betting the architecture on a single preview.
