Building a Least-Privilege AI Agent Gateway for Infrastructure Automation with MCP, OPA, and Ephemeral Runners

This article explores a secure architecture for AI-driven infrastructure automation that combines Model Context Protocol (MCP), Open Policy Agent (OPA), and ephemeral runners to create governance boundaries between autonomous agents and critical systems. The approach establishes least-privilege access patterns, policy-as-code enforcement, and isolated execution environments to mitigate risks associated with autonomous agents making infrastructure changes.
The rapid adoption of AI-driven automation in infrastructure operations presents a significant security challenge. As organizations delegate more operational tasks to autonomous or semi-autonomous agents, traditional access controls and approval processes become inadequate. This article examines a reference architecture that establishes clear governance boundaries between AI agents and the infrastructure they operate on, combining Model Context Protocol (MCP), Open Policy Agent (OPA), and ephemeral runners to create a least-privilege execution environment.
The Problem: Agents Without Guardrails
Many engineering teams are experimenting with automation beyond traditional scripts and pipelines. Instead of humans clicking through dashboards or manually approving changes (a practice often referred to as "ClickOps"), some organizations are beginning to delegate operational tasks to autonomous or semi-autonomous agents. These agents may generate infrastructure changes, trigger deployments, or respond to operational signals with little or no human intervention.
Unlike traditional CI/CD bots, which execute predefined pipelines with static permissions and deterministic inputs, agent-driven systems introduce dynamic decision-making and cross-system actions at runtime. This shift introduces a new class of risk. Traditional automation is usually scoped to a single tool or workflow; AI-driven agents, by contrast, often operate across multiple systems: CI/CD platforms, cloud APIs, infrastructure-as-code tools, and internal services.
When these agents are granted broad or persistent permissions, they effectively inherit the same level of access as a highly privileged human operator, but without the same contextual judgment or accountability. For example, in a multi-region deployment, a standby region may not be actively serving traffic but is critical for failover. An agent responding to a cost-optimization or remediation signal may misinterpret the lack of traffic as unused capacity and modify or terminate resources in the standby region. When traffic later fails over from the primary region, the impact is severe, and there is no clear approval trail or human decision point to hold responsible.
The consequences of failure in this space are concrete. An agent misinterpreting an instruction can initiate destructive infrastructure changes, such as tearing down environments or modifying production resources. A compromised agent identity can be abused to exfiltrate secrets, create unauthorized workloads, or consume resources at scale. In practice, teams often discover these issues late, because traditional logs record what happened, but not why an agent decided to act in the first place.
For organizations, this liability creates operational and governance challenges. Incidents become harder to investigate, change approvals are bypassed unintentionally, and security teams are left with incomplete audit trails. Over time, this problem erodes trust in automation itself, forcing teams to either roll back agent usage or accept increasing levels of unmanaged risk.
One approach is to limit or block agent-driven actions entirely, but doing so undermines the value agents are meant to provide. A more sustainable approach is to introduce an explicit control layer between agents and the systems they operate on. In this article, we focus on an AI Agent Gateway, a dedicated boundary that validates intent, enforces policy as code, and isolates execution before any infrastructure or service API is invoked. Rather than treating agents as privileged actors, this model treats them as untrusted requesters whose actions must be authorized, constrained, observed, and contained.
Design Principles
Before diving into the individual principles, it helps to establish what the AI Agent Gateway is at a structural level, drawing from established production security and platform patterns rather than AI-specific theory. At its core, the gateway acts as a control boundary between autonomous agents and infrastructure systems. Agents never interact with infrastructure APIs directly. Instead, every request passes through a centralized gateway that validates intent, enforces authorization rules, and delegates execution to isolated, short-lived environments.
This separation allows organizations to introduce AI-driven automation without giving agents persistent or unrestricted access to critical systems. The gateway architecture adheres to the following design principles:
- Policy as Code - Externalizes authorization logic into declarative policies (OPA), avoiding hardcoded access rules inside application code.
- Least Privilege - Prevents direct agent communication with infrastructure APIs. The gateway mediates every request and limits execution to the minimum required permissions.
- Ephemeral Execution - Forces actions to run in short-lived, isolated environments that are destroyed immediately after execution.
- Observability by Default - Tracks every request and execution, producing traces, metrics, and audit logs, enabling inspection and post-incident analysis.
- Versioning and Auditability - Tracks requests using plan hashes, idempotency keys, and immutable job metadata, ensuring repeatability and traceability.
- Local First, Cloud-Ready - Runs the same architecture locally for experimentation and testing, while remaining portable to production environments.
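These principles are concrete enough to sketch in code. For instance, versioning and auditability reduce to attaching a content hash and an idempotency key to every request. The following Python sketch illustrates the idea; the field and function names are ours, not from a reference implementation:

```python
import hashlib
import uuid


def plan_hash(plan_bytes: bytes) -> str:
    """Content-address the plan so policy can later verify its integrity."""
    return hashlib.sha256(plan_bytes).hexdigest()


def build_request(identity: str, tool: str, args: dict) -> dict:
    """Wrap a tool call in immutable, traceable job metadata."""
    return {
        "identity": identity,
        "tool": tool,
        "args": args,
        # The idempotency key lets the gateway deduplicate retried requests.
        "idempotency_key": str(uuid.uuid4()),
    }
```

Because the hash is derived from the plan content itself, any tampering between request and execution is detectable by re-hashing.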
Reference Architecture
The gateway architecture follows a defense in depth model, a well-established security principle in infrastructure and cloud systems. Rather than relying on a single control to prevent misuse, defense in depth applies multiple, independent safeguards so that failure in one layer does not result in full system compromise. This approach is commonly used in Zero Trust networking, cloud IAM design, and production-grade CI/CD pipelines.
In agent-driven systems, relying on a single control, such as prompt constraints or static tool allow lists, is insufficient because agents make decisions at runtime and may act across multiple tools and systems in ways that cannot be fully anticipated or constrained by a single safeguard.
In the context of AI-driven automation, defense in depth means that no single component, neither the agent, nor the gateway, nor the execution environment, has enough authority on its own to cause damage. Each layer performs a narrow, well-defined role, and every transition between layers is validated. This principle is reflected in how the architecture deliberately separates who requests an action from where that action is executed.
AI agents are treated as untrusted requesters. They can discover capabilities and submit structured requests, but they never interact with infrastructure APIs directly. All execution happens behind a strict gateway that enforces validation, authorization, and isolation. The flow through the system is intentionally one-way, enforcing the invariant that no execution occurs without prior authorization and isolated execution:
- Discovery - The agent uses the Model Context Protocol (MCP) to discover which tools are available and what inputs they require.
- Request - The agent invokes a tool (for example, apply_infra) using a JSON-RPC call.
- Validation - The gateway validates the request schema, computes a plan hash, enriches the request with identity and context, and sends it to OPA for authorization.
- Decision - If OPA denies the request, the gateway returns a 403 response and execution stops. If approved, the request is converted into a job and placed onto the execution queue.
- Execution - A short-lived runner pulls the job, creates an isolated namespace, applies the infrastructure plan, and deletes the environment after completion.
- Observability - Metrics and traces are emitted at each stage, allowing dashboards to track policy decisions, execution latency, and failure modes in real time.
This separation ensures that even if an agent behaves unexpectedly due to prompt errors, misconfiguration, or compromise, the blast radius is constrained. Authorization decisions are made before execution, execution occurs in isolated environments, and every step is observable and auditable.
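The one-way flow above can be sketched as a short pipeline in which a policy denial halts the request before anything is queued. All names here are illustrative; in the real architecture the decision comes from an OPA query, not an in-process callable:

```python
class Denied(Exception):
    """Raised when the policy layer rejects a request; maps to an HTTP 403."""


def authorize(request: dict, policy) -> None:
    # Stand-in for the OPA call: 'policy' is any callable that returns
    # True or False for the enriched request.
    if not policy(request):
        raise Denied(f"policy denied {request['tool']}")


def handle(request: dict, policy, queue: list) -> str:
    authorize(request, policy)  # the decision always happens first...
    queue.append(request)       # ...and execution is only ever queued
    return "queued"
```

The invariant is structural: there is no code path that reaches the queue without passing `authorize`, which mirrors the "no execution without prior authorization" rule above.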
Component Overview
The AI Agent Gateway is designed as a composition of narrowly scoped components rather than a single, monolithic service. Each component addresses a distinct responsibility (request mediation, authorization, or execution) so that agent behavior can be governed, audited, and evolved independently of infrastructure execution details.
This separation is intentional and reflects the core design goal of minimizing blast radius while keeping the system understandable and testable. The architecture is split into three parts, each communicating through well-defined contracts to support replaceability and testability:
- The Gateway (API layer) - Accepts agent requests, validates intent, enforces authorization decisions, and coordinates execution.
- The Policy Layer - Encapsulates all authorization and safety rules using policy as code.
- The Execution Layer - Performs approved actions inside isolated, short-lived environments.
This separation allows each layer to evolve independently. Policies can change without redeploying the gateway, execution environments can be hardened without touching authorization logic, and agents can be replaced or upgraded without impacting infrastructure controls.
The Gateway (MCP Layer)
We start by building the coordination layer of the system, rather than a decision-making component. We chose the Model Context Protocol (MCP) specifically because it decouples the agent from the tool definition. This decision allows us to swap the LLM or the agent framework without rewriting our governance layer, which is a crucial requirement for avoiding vendor lock-in.
We implemented the Gateway logic in TypeScript to handle two critical tasks: discovery and enforcement. First, we define the apply_infra tool structure so the agent understands the required inputs (plan, path, hash, and environment), constraining agent behavior to explicitly declared capabilities and preventing it from inventing or invoking undeclared actions.
When the tool is called, we rely on strict schema validation. We then pass the context to OPA. Note that we do not execute the infrastructure change here; we only queue it. This separation ensures that the gateway remains focused on request validation and coordination rather than execution details.
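The reference gateway is written in TypeScript; as an illustration of the same strict-validation pattern, a Python sketch might look like the following. The four fields match the tool inputs named above; everything else is an assumption:

```python
# Declared schema for the apply_infra tool: field name -> expected type.
REQUIRED_FIELDS = {"plan": str, "path": str, "hash": str, "environment": str}


def validate_apply_infra(args: dict) -> dict:
    """Reject anything that does not exactly match the declared schema."""
    unknown = set(args) - set(REQUIRED_FIELDS)
    if unknown:
        # Undeclared fields are rejected, not silently ignored.
        raise ValueError(f"undeclared fields: {sorted(unknown)}")
    for field, ftype in REQUIRED_FIELDS.items():
        if not isinstance(args.get(field), ftype):
            raise ValueError(f"missing or invalid field: {field}")
    # The validated request is then enriched with identity and context and
    # sent to OPA; on "allow" it is queued as a job. The gateway itself
    # never executes the plan.
    return args
```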
Policy as Code
We move authorization logic out of the TypeScript code and into Open Policy Agent (OPA). This decision allows us to enforce complex business rules without redeploying the gateway. In the policy engine, we defined four non-negotiable rules:
- RBAC - sre-bot has full access; deploy-bot is restricted to non-prod.
- Integrity - The plan hash must match a registered artifact.
- Safety - Plans ending in -destroy.plan are explicitly blocked.
- Change management - Deployments are only allowed Mon-Fri, 09:00-17:00 UTC.
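In the gateway these rules are written in Rego and evaluated by OPA; the Python sketch below mirrors the same four checks so the logic can be read and unit-tested in one place. The identities come from the rules above, while the hash registry and exact boundary handling are illustrative:

```python
from datetime import datetime, timezone

REGISTERED_HASHES = {"abc123"}   # stand-in for the registered artifact list
NONPROD = {"dev", "staging"}


def allow(req: dict, now=None) -> bool:
    now = now or datetime.now(timezone.utc)
    # RBAC: sre-bot has full access; deploy-bot is restricted to non-prod.
    if req["identity"] not in {"sre-bot", "deploy-bot"}:
        return False
    if req["identity"] == "deploy-bot" and req["environment"] not in NONPROD:
        return False
    # Integrity: the plan hash must match a registered artifact.
    if req["hash"] not in REGISTERED_HASHES:
        return False
    # Safety: destroy plans are explicitly blocked.
    if req["path"].endswith("-destroy.plan"):
        return False
    # Change management: Mon-Fri, 09:00-17:00 UTC only.
    if now.weekday() > 4 or not (9 <= now.hour < 17):
        return False
    return True
```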
By externalizing these rules into OPA, organizations can adjust security policies without modifying application code, enabling faster response to changing threat landscapes or business requirements.
The Ephemeral Runner
The runner is the "hands" of the system. Implemented in Python, it manages the execution lifecycle and ensures the environment is clean before and after each run. The runner adheres to a strict workflow:
- Generate a unique namespace (run-uuid).
- Execute the plan using kubectl and tofu.
- Always delete the namespace, even if the job fails.
This workflow guarantees that we never leave "orphaned" resources running in the cluster. The ephemeral nature of these environments is critical for containing potential damage from misconfigured or malicious agent actions.
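A minimal sketch of this lifecycle, with the command runner injectable for testing; the exact kubectl and tofu invocations are illustrative, not the reference commands:

```python
import subprocess
import uuid


def run_job(plan_path: str, run=None) -> str:
    """Execute one approved plan inside a throwaway namespace.

    'run' is injectable so the lifecycle can be tested without a cluster;
    by default it shells out to the real binaries.
    """
    run = run or (lambda *cmd: subprocess.run(cmd, check=True))
    ns = f"run-{uuid.uuid4()}"
    run("kubectl", "create", "namespace", ns)
    try:
        run("tofu", "apply", "-auto-approve", plan_path)
        return "succeeded"
    finally:
        # Teardown runs even when apply fails: no orphaned resources.
        run("kubectl", "delete", "namespace", ns, "--wait=false")
```

The `try/finally` shape is the whole point: the namespace deletion is not a cleanup step that can be skipped on error, it is structurally unavoidable once execution starts.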
Scaling to Enterprise
The architecture described so far is intentionally minimal and suitable for local experimentation and controlled environments. However, as teams move from individual workflows to organization-wide adoption, several aspects of the system need to evolve while preserving the same control plane semantics.
One of the first pressure points is execution isolation. Kubernetes namespaces provide a reasonable sandbox for local testing and early prototypes, but they are often insufficient in regulated or multi-tenant environments. As adoption grows, teams typically move ephemeral runners into stronger isolation boundaries, such as lightweight virtual machines (Firecracker or Kata Containers, for example) or dedicated, short-lived Kubernetes clusters.
Another scaling concern is artifact trust. In early stages, validating a plan hash inside policy is often enough to prevent accidental drift. At enterprise scale, this approach does not hold. Plans must be traceable, verifiable, and attributable. Many teams address this by introducing a signed plan catalog backed by an internal artifact registry or tooling such as Sigstore.
As the impact of changes increases, fully automated execution is rarely acceptable. High-risk actions, such as production changes or destructive operations, often require explicit human approval. Instead of embedding approval logic inside the agent or the runner, the gateway becomes the coordination point. When approval is required, the gateway returns a Pending status and instructs the agent to retry only after a signed approval token is issued through an external system such as Slack or Jira.
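The approval handshake can be sketched with an HMAC-signed token that the gateway verifies before releasing a Pending request. The token format, secret handling, and names here are assumptions, not the reference implementation:

```python
import hashlib
import hmac

APPROVAL_SECRET = b"rotate-me"  # illustrative; load from a real secret store


def issue_approval(request_id: str, approver: str) -> str:
    """Issued by the external approval system (e.g. a Slack workflow)."""
    msg = f"{request_id}:{approver}".encode()
    sig = hmac.new(APPROVAL_SECRET, msg, hashlib.sha256).hexdigest()
    return f"{request_id}:{approver}:{sig}"


def verify_approval(token: str, request_id: str) -> bool:
    """Checked by the gateway before a Pending request may proceed."""
    try:
        rid, approver, sig = token.split(":")
    except ValueError:
        return False
    expected = hmac.new(APPROVAL_SECRET, f"{rid}:{approver}".encode(),
                        hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    return rid == request_id and hmac.compare_digest(sig, expected)
```

Because the token binds the approval to a specific request ID, an approval issued for one change cannot be replayed against another.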
Finally, geography becomes a governance concern. In multi-region environments, execution must occur close to the infrastructure being managed, while control logic remains centralized. Agents should not decide where work runs. Instead, ephemeral runners are deployed regionally, and policy determines where execution is permitted. This policy decision prevents agents from crossing regulatory or data residency boundaries while preserving a single, consistent control plane.
Operational SLOs
Performance is not a secondary concern in agent governance. If authorization or execution becomes slow or unpredictable, teams will bypass the system. The following Service Level Objectives (SLOs) are designed to protect both developer trust and operational safety:
- Policy Decision Latency (< 100 ms) - Authorization must be fast enough to remain invisible to the agent.
- Runner Start Latency (< 2s Dev / < 5s Staging) - Ephemeral execution is only viable if startup cost remains low.
- Denied Actions (≤ 2%) - A high denial rate often indicates poor tool design or overly coarse policies.
- Sandbox Teardown Time (< 30s) - Cleanup latency directly affects blast radius.
- Audit Log Availability (< 5 min) - Governance is ineffective if evidence arrives after the fact.
These SLOs are enforced through alerts. When they degrade, automation needs to pause. This approach ensures that the system remains responsive and reliable as it scales.
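The pause-on-degradation behavior can be sketched as a simple breaker over a rolling window of measurements. The thresholds mirror the SLOs above; the window size and breach ratio are illustrative:

```python
from collections import deque

# SLO limits from the list above (units in the key names).
SLOS = {"policy_decision_ms": 100, "runner_start_s": 2.0}


class SloBreaker:
    """Pause automation when a rolling window of samples breaches its SLO."""

    def __init__(self, metric: str, window: int = 20, breach_ratio: float = 0.1):
        self.limit = SLOS[metric]
        self.samples = deque(maxlen=window)
        self.breach_ratio = breach_ratio

    def record(self, value: float) -> None:
        self.samples.append(value)

    def paused(self) -> bool:
        if not self.samples:
            return False
        breaches = sum(1 for v in self.samples if v > self.limit)
        return breaches / len(self.samples) > self.breach_ratio
```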
Business Impact
Implementing a least-privilege AI agent gateway architecture provides several key business benefits:
Reduced Security Risk - By constraining agent actions through policy enforcement and isolated execution, organizations can significantly reduce the blast radius of potential agent errors or compromises.
Improved Compliance - The audit trail and policy-as-code approach make it easier to demonstrate compliance with regulatory requirements and internal governance standards.
Faster Incident Response - The comprehensive observability built into the system enables faster detection and diagnosis of issues when they occur.
Increased Trust in Automation - By establishing clear boundaries and controls, organizations can increase confidence in AI-driven automation without sacrificing security.
Operational Efficiency - The separation of concerns allows different teams to work on components independently, accelerating development and maintenance cycles.
Conclusion
Building and exercising this system locally, intentionally triggering policy violations, failed executions, and cleanup edge cases, made one thing clear early on: agent safety is not something you retrofit with documentation or model tuning. It only works when guardrails execute as part of the system itself.
The most effective control was placing governance outside the execution path. Static guidelines, access reviews, and best-practice documents were easy to bypass during automation experiments. In contrast, controls enforced by the gateway and evaluated on every request consistently held, because agents never interacted with infrastructure APIs directly.
Treating governance as a system boundary and not an afterthought changed how safely automation could evolve. Separating intent from execution also proved critical. Letting agents describe what they wanted to do, while runners controlled how it happened, simplified both safety and debugging.
Observability played an equally important role, even in a local setup. Traces and logs that captured policy decisions, execution steps, and sandbox cleanup made agent behavior inspectable rather than assumed. Instead of trusting that an agent "did the right thing", we could verify what happened, when it happened, and why a decision was allowed or denied.
Finally, ephemeral execution fundamentally changed how risky experimentation felt. Knowing that every action ran in a short-lived environment with mandatory teardown made it safe to test destructive scenarios without leaving residual state behind. This approach reduced the cost of failure and encouraged stricter policies, because mistakes were contained by design.
Taken together, these lessons point to a broader conclusion: the safety of AI-driven automation improves less through smarter models and more through explicit, enforceable boundaries. By decoupling intent (agents), authorization (policy as code), and execution (ephemeral runners), the least-privilege AI agent gateway turns abstract AI risk into concrete engineering constraints. Trust in agents, in this model, is not a belief; it is something you can observe, measure, and enforce.
