Microsoft's new Microsoft.App/SandboxGroups resource runs LLM-generated code in per-session microVMs that boot in under a second, scale to zero, and deny network egress by default. It lands in a market already crowded with E2B, Cloudflare Sandboxes, and Fly.io Sprites, but bets on Azure-native identity and ARM tooling to win over teams already in the ecosystem.
Microsoft has pushed Azure Container Apps Sandboxes into public preview, adding a new ARM resource type, Microsoft.App/SandboxGroups, built specifically to execute untrusted code that AI agents generate at runtime. Each sandbox boots from an OCI disk image in under a second, scales to thousands of concurrent instances, and costs nothing when idle. That billing shape maps directly onto the short, bursty execution pattern that agentic workloads produce.

What's new
The core unit is a per-session microVM, isolated from the host, the platform, and every other sandbox running on the same hardware. You bring any OCI-compliant container image, and the platform handles provisioning from pre-warmed pools, multi-tenant isolation, and the full lifecycle from startup through teardown. You are not standing up the orchestration yourself.
Sandboxes are grouped into Sandbox Groups, which act as the management and configuration boundary. If you have worked with Container Apps Environments, the mental model is similar, except these groups are tuned for short-lived workloads. A group carries the shared settings that apply to every sandbox inside it: network egress policy, managed identity assignment, lifecycle rules, and resource tiers.
Three operational features make this usable in production rather than just demos:
- Snapshot-based suspend and resume. Full memory and disk state survives across sessions. An agent can pause a multi-step investigation, or a dev environment with packages already installed, scale it to zero, and resume later with no re-initialisation.
- Egress denied by default. Outbound traffic is blocked unless a host is explicitly allowlisted, and that rule is enforced at a proxy layer inside the sandbox itself.
- Entra managed identities. Both system-assigned and user-assigned identities work, so a sandbox can authenticate to Azure services without baking credentials into the image or threading secrets through environment variables.
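To make the deny-by-default egress rule concrete, here is a minimal sketch of how an allowlist check of that kind could behave. It is an illustration only, not the platform's proxy: the function name egress_allowed and the hosts in ALLOWED_HOSTS are assumptions made up for this example.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; these hosts are illustrative, not service defaults.
ALLOWED_HOSTS = frozenset({"pypi.org", "files.pythonhosted.org"})


def egress_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True only when the URL's host is explicitly allowlisted."""
    host = urlsplit(url).hostname  # lowercased by urlsplit; None if absent
    if host is None:
        return False  # deny anything that does not parse to a host
    # Exact match only, so "pypi.org.evil.example" does not pass as "pypi.org".
    return host.rstrip(".") in allowed_hosts
```

The point of the exact-match comparison is that the default answer is "no": a request goes out only when its host appears in the list, and lookalike or unparseable destinations fall through to a denial.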

Why it matters
The threat here is concrete, not hypothetical. When an LLM writes code and an agent runs it in-process, the execution surface becomes the attack surface. A Python planner step that looks harmless (fetching a URL, reading an environment variable, calling exec()) can quietly exfiltrate API keys or pull an arbitrary payload using nothing but the standard library. Without a hard wall between generated code and the host, any capable model is one prompt injection away from a postmortem.
Until now, teams building multi-tenant platforms, CI/CD automation, or LLM-backed code interpreters rolled their own isolation. That usually meant container runtimes locked down with restrictive seccomp profiles, or dedicated Kubernetes clusters running Kata Containers. Both work, and both demand continuous operational investment to keep the boundary tight as the workload evolves.
Microsoft also shipped an Agent Governance Toolkit that integrates with the sandboxes through the agt-sandbox Python package. It adds two enforcement layers that operate independently:
- AST scanning and tool allowlists, applied before a snippet ever runs.
- Egress allowlist enforcement at the network boundary inside the sandbox.
Because the layers are independent, a denied call never reaches the execution environment, and an outbound request to a non-allowlisted host fails at the proxy regardless of what the in-process policy permits. The result is defense in depth: neither control can undercut the other.
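The pre-execution layer can be pictured with a short sketch built on Python's standard ast module. This is not the agt-sandbox API, which the article does not describe; scan_snippet, DENIED_CALLS, and ALLOWED_IMPORTS are hypothetical names and values chosen for illustration.

```python
import ast

# Hypothetical policy values, for illustration only.
DENIED_CALLS = {"exec", "eval", "compile", "__import__"}
ALLOWED_IMPORTS = {"math", "json"}


def scan_snippet(source: str) -> list:
    """Walk the snippet's AST and return a list of policy violations.

    Raises SyntaxError if the snippet does not parse.
    """
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Direct calls by name, e.g. exec(...)
            if node.func.id in DENIED_CALLS:
                violations.append(f"denied call: {node.func.id}")
        elif isinstance(node, ast.Import):
            for alias in node.names:
                root = alias.name.split(".")[0]
                if root not in ALLOWED_IMPORTS:
                    violations.append(f"denied import: {root}")
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"; treat as denied
            root = (node.module or "").split(".")[0]
            if root not in ALLOWED_IMPORTS:
                violations.append(f"denied import: {root}")
    return violations
```

A snippet would be run only when scan_snippet returns an empty list. A static scan like this one can be evaded, for example by building a function name from strings at runtime, which is why a separate network-level control that does not rely on the scan is valuable.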
Worth paying attention to: Microsoft names the products already running on this fabric, including Cloud Sandboxes in GitHub Copilot, Foundry Hosted Agents, and Azure Container Apps Express. Rather than inventing a fresh trust-based abstraction for customers, Microsoft is exposing the same isolation layer it runs its own developer products on.

How it stacks up
The preview arrives in a busy field, and each competitor stakes out a different position:
- E2B runs Firecracker microVMs purpose-built for agent code execution, advertises sub-200ms cold starts, and offers BYOC for teams with data-residency requirements. It has picked up adoption across a number of Fortune 500 shops.
- Cloudflare Sandboxes, launched in April 2026, provides persistent isolated Linux environments with active-CPU pricing and snapshot-based session recovery, aimed at teams already on Workers.
- Fly.io Sprites, out since January 2026, leans persistent-by-default with Firecracker microVMs and 100GB NVMe storage, on the argument that rebuilding state each time wastes latency.
Azure Container Apps Sandboxes are not trying to win purely on cold-start numbers. The differentiator is Azure-native integration: Entra identity, ARM-native resource management, and egress control all reachable through tooling teams already use, on top of the same infrastructure that backs GitHub Copilot, with no orchestration layer to babysit.
The trade-off is real. If you sit outside Azure, need GPU-heavy execution, want BYOC for data residency, or prefer open-source isolation primitives you can inspect and self-host, the dedicated providers give you more room to maneuver. The decision comes down to where your identity and networking already live. For an Azure-centric platform team, pulling sandbox isolation into the same ARM and Entra control plane removes a meaningful chunk of glue code. For everyone else, the specialists still offer flexibility that a cloud-native service, by design, does not.
