The Microsoft Foundry Agent Lab walks developers through nine incremental demos that introduce core agent capabilities—tool calling, UI decoupling, server‑side tools, code interpretation, RAG, MCP integration, toolbox governance and self‑hosting—while keeping the architecture simple with a single model‑router deployment and server‑side conversation state.

Building AI Agents with Microsoft Foundry: A Progressive Lab from Hello World to Self‑Hosted

What changed?

Microsoft Foundry released a structured, open‑source Agent Lab that turns the traditionally chaotic process of building AI agents into a step‑by‑step curriculum. Instead of a monolithic example that mixes retrieval‑augmented generation, tool‑calling, streaming, and UI code, the lab provides nine self‑contained demos, each adding exactly one new primitive. All demos share the same Foundry SDK, a single model‑router deployment, and server‑side conversation management via the Responses API. This approach reduces the on‑ramp friction for engineers and delivers a reusable reference architecture for production‑grade agents.

Provider comparison – why Foundry’s model‑router matters

Feature	Microsoft Foundry (Model‑Router)	OpenAI Direct Calls	Anthropic / Cohere
Routing logic	Automatic selection based on task complexity, cost and latency; zero code required	Developer writes custom logic or selects a single model per request	Typically static model selection; custom routing must be built by the user
Cost optimisation	Routes cheap factual queries to grok‑4‑1‑fast‑reasoning and reserves frontier models for code or tool‑heavy turns	All calls hit the same model; cost can spike when a heavy model is used for simple queries	Similar to OpenAI; no built‑in cost tiering
Latency	Fast models for simple turns keep response times low; heavy turns use more capable but slower models only when needed	Latency dictated by the single model chosen; may be higher than necessary for trivial requests
Complexity	No routing code; you only declare what the task needs (e.g., tool calling)	Must manage model selection manually, increasing boilerplate
Integration	Works natively with Foundry’s Responses API, MCP, Toolbox, and built‑in tools (WebSearch, CodeInterpreter, FileSearch)	Requires separate SDKs or wrappers for each capability
Security	Uses DefaultAzureCredential – no API keys in code, managed identity in production	API keys often stored in env files or secret managers; higher secret‑management overhead

The lab’s empirical data (see MODEL-ROUTER.md) shows the router picking the right model for each demo, from cheap factual recall to frontier code‑generation models, eliminating the need for developers to maintain a model‑selection matrix.

Business impact – how the progressive demos translate to production decisions

1. Start with the minimum viable agent (Demo 0)

Code footprint: < 30 lines using the Foundry SDK.
Conversation state: Stored server‑side via the Responses API, removing the classic bug of lost history in multi‑instance deployments.
Authentication: DefaultAzureCredential works locally (az login) and in Azure (managed identity) – no secrets to rotate.
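To make the server-side state point concrete, here is a minimal sketch in plain Python. It does not call Foundry: `ConversationStore` and `handle_turn` are hypothetical stand-ins for the service-side store and for a stateless worker, and the formatted echo stands in for a model call. The point is only that a worker holding no local history can continue a conversation given its ID.

```python
import uuid


class ConversationStore:
    """Hypothetical stand-in for server-side conversation storage."""

    def __init__(self):
        self._turns = {}

    def create(self):
        conversation_id = f"conv_{uuid.uuid4().hex}"
        self._turns[conversation_id] = []
        return conversation_id

    def append(self, conversation_id, role, text):
        self._turns[conversation_id].append((role, text))

    def history(self, conversation_id):
        return list(self._turns[conversation_id])


def handle_turn(store, conversation_id, user_text):
    """A stateless worker: everything it needs is looked up by ID."""
    store.append(conversation_id, "user", user_text)
    # Placeholder for the model call; a real agent would send the history.
    turn_number = len(store.history(conversation_id)) // 2 + 1
    reply = f"(turn {turn_number}) you said: {user_text}"
    store.append(conversation_id, "assistant", reply)
    return reply
```

Because `handle_turn` keeps nothing between calls, any instance behind a load balancer could serve the next turn, which is the property the Responses API is described as providing.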

2. Add function tools only when needed (Demo 1)

Control: Function tools run in the client process, letting you enforce custom error handling, rate limits, or compliance checks.
Strict schema: strict=True constrains the model's arguments to the declared JSON schema, reducing parsing errors in production.
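A client-side function tool might be sketched as follows. This is an illustration under assumptions, not the lab's code: the declaration uses the OpenAI-style function-tool shape (`type`, `name`, `parameters`, `strict`), and `get_weather`, `LOCAL_TOOLS`, and `dispatch_tool_call` are hypothetical names. The Foundry SDK's `FunctionTool` class may be constructed differently.

```python
import json

# Tool declaration in the OpenAI-style function-tool shape (assumed here).
# In strict mode every property is listed in "required" and
# additionalProperties is False.
GET_WEATHER_TOOL = {
    "type": "function",
    "name": "get_weather",
    "description": "Return a canned weather report for a city.",
    "strict": True,
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
        "additionalProperties": False,
    },
}


def get_weather(city: str) -> str:
    # Hypothetical local implementation; a real tool would call a weather API.
    return f"Sunny in {city}"


LOCAL_TOOLS = {"get_weather": get_weather}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run a model-requested tool in the client process and return JSON output."""
    if name not in LOCAL_TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        kwargs = json.loads(arguments_json)
        result = LOCAL_TOOLS[name](**kwargs)
    except (json.JSONDecodeError, TypeError) as exc:
        # Client-side control point: errors are reported back, not raised.
        return json.dumps({"error": str(exc)})
    return json.dumps({"result": result})
```

The dispatcher is where custom error handling, rate limits, or compliance checks would go, since it runs in your process rather than in the service.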

3. Decouple UI from agent logic (Demo 2)

Portability: The same agent can be surfaced via a terminal, a Tkinter desktop app, or later a web UI without changing the agent definition.
Team alignment: Front‑end developers focus on UX, back‑end engineers on prompt engineering and tool integration.
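One way to picture the decoupling is to treat the agent as a plain callable and let each front end depend only on that. The sketch below is an assumption about structure, not the lab's code: `AgentFn`, `run_terminal_ui`, `collect_replies`, and `echo_agent` are hypothetical names, and `echo_agent` stands in for the real Foundry-backed call.

```python
from typing import Callable, Iterable, List

# The agent is anything that maps a user message to a reply.
AgentFn = Callable[[str], str]


def run_terminal_ui(
    agent: AgentFn,
    inputs: Iterable[str],
    write: Callable[[str], None] = print,
) -> None:
    """Terminal front end: reads lines, delegates to the agent, writes replies."""
    for line in inputs:
        if line.strip().lower() in {"quit", "exit"}:
            break
        write(f"agent> {agent(line)}")


def collect_replies(agent: AgentFn, inputs: Iterable[str]) -> List[str]:
    """A second front end (for tests or a GUI) that reuses the same agent."""
    return [agent(line) for line in inputs]


def echo_agent(message: str) -> str:
    # Hypothetical placeholder for the real agent call.
    return message.upper()
```

Swapping the terminal for a Tkinter window or a web handler would then mean writing another function with the same `AgentFn` dependency, leaving the agent definition untouched.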

4. Leverage server‑side built‑in tools (Demos 3‑5)

WebSearchTool: Removes the client‑side loop; the model decides when to search and Foundry returns citations.
CodeInterpreterTool: Provides a sandboxed Python environment inside Foundry, ideal for data‑analysis or chart generation workloads.
FileSearchTool + vector store: Enables Retrieval‑Augmented Generation without managing an external vector DB; the vector store lives in Foundry and persists across sessions.
Business benefit: Faster time‑to‑market for RAG‑based support bots or analytics assistants, with lower operational overhead.
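With server-side tools the client only declares what the agent may use; there is no execution loop to write. The helper below sketches that idea with plain dictionaries. The shapes follow the OpenAI-style Responses API tool declarations and are an assumption here: the Foundry SDK classes named above (`WebSearchTool`, `CodeInterpreterTool`, `FileSearchTool`) may take different arguments, and `build_server_side_tools` and the `vs_demo` ID are hypothetical.

```python
from typing import Dict, List, Optional


def build_server_side_tools(
    web_search: bool = False,
    code_interpreter: bool = False,
    vector_store_id: Optional[str] = None,
) -> List[Dict]:
    """Return tool declarations; the service, not the client, executes them."""
    tools: List[Dict] = []
    if web_search:
        tools.append({"type": "web_search"})
    if code_interpreter:
        # "auto" asks the service to provision the sandbox container.
        tools.append({"type": "code_interpreter", "container": {"type": "auto"}})
    if vector_store_id is not None:
        # RAG: point file search at a vector store that lives in the service.
        tools.append({"type": "file_search", "vector_store_ids": [vector_store_id]})
    return tools
```

For example, `build_server_side_tools(vector_store_id="vs_demo")` yields a single file-search declaration, which is all a RAG-style agent would need to pass on the client side.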

5. Adopt Model Context Protocol (MCP) for external system integration (Demos 6‑7)

MCP servers expose tools (e.g., GitHub issues) over a standard wire protocol; agents discover and call them without hard‑coding function signatures.
Toolbox adds governance: versioned snapshots, central ownership, and permission scoping (allowed_tools).
Risk mitigation: Human‑in‑the‑loop approval before side‑effecting calls prevents accidental data changes.
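The combination of permission scoping and human approval can be sketched as a small gate in front of each requested tool call. This is a plain-Python illustration, not the Toolbox or MCP client API: the tool names, `ALLOWED_TOOLS`, `SIDE_EFFECTING`, and `gate_tool_call` are all hypothetical.

```python
from typing import Callable, Dict, Set

# Hypothetical tool names for a GitHub-issues style MCP server.
ALLOWED_TOOLS: Set[str] = {"list_issues", "create_issue"}
SIDE_EFFECTING: Set[str] = {"create_issue"}


def gate_tool_call(
    tool_name: str,
    arguments: Dict,
    approve: Callable[[str, Dict], bool],
) -> str:
    """Return 'blocked', 'denied', or 'allowed' for a requested tool call."""
    if tool_name not in ALLOWED_TOOLS:
        # Outside the allow-list: never reaches a human or the server.
        return "blocked"
    if tool_name in SIDE_EFFECTING and not approve(tool_name, arguments):
        # A human reviewed the call and rejected it.
        return "denied"
    return "allowed"
```

Read-only tools pass without prompting, while anything that changes data waits on the `approve` callback, which in a real UI would show the arguments to a person.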

6. Self‑hosted agents when you need full control (Demo 8)

Custom inference path: Deploy a Dockerised server that implements the Responses protocol, allowing pre‑ or post‑processing that cannot be expressed in a system prompt.
Use cases: Compliance‑driven environments, A/B testing of prompt variants, or orchestrators that need to expose themselves as agents to other orchestrators.
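The kind of pre- and post-processing a self-hosted agent allows can be sketched as a wrapper around the inference call. This is a simplified illustration that leaves out the HTTP server and the Responses protocol itself; `redact_emails`, `add_footer`, and `make_hosted_handler` are hypothetical names, and the email pattern is deliberately crude.

```python
import re
from typing import Callable

EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_emails(text: str) -> str:
    """Pre-processing step: mask email addresses before the model sees them."""
    return EMAIL_PATTERN.sub("[redacted-email]", text)


def add_footer(text: str) -> str:
    """Post-processing step: append a fixed notice to every reply."""
    return f"{text}\n-- reviewed by policy filter"


def make_hosted_handler(infer: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap an inference function with pre- and post-processing."""

    def handler(user_text: str) -> str:
        return add_footer(infer(redact_emails(user_text)))

    return handler
```

Both steps run deterministically in code you own, which is the difference from asking for the same behaviour in a system prompt.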

Migration considerations

Migration Step	What to change	Impact on cost / ops
From local API keys to DefaultAzureCredential	Replace `openai.api_key` with `DefaultAzureCredential` and configure managed identity on Azure resources.	Reduces secret‑management burden; adds a small, usually negligible token‑acquisition latency from Microsoft Entra ID (formerly Azure AD).
From client‑side tool execution to built‑in tools	Swap `FunctionTool` definitions with `WebSearchTool`, `CodeInterpreterTool`, or `FileSearchTool`. Remove the tool‑calling loop in client code.	Cuts compute cost for tool execution (Foundry runs them on shared infrastructure) and simplifies error handling.
From single‑model calls to Model‑Router	Set `model=MODEL_DEPLOYMENT` where `MODEL_DEPLOYMENT` = `model-router`. Remove any per‑request model selection logic.	Optimises spend automatically; latency improves for simple queries.
From ad‑hoc conversation storage to Responses API	Use `openai.conversations.create()` and pass `conversation.id` on each turn. Delete any local `history` arrays.	Enables horizontal scaling; eliminates state‑sync bugs.
From monolithic agent to toolbox‑governed tools	Register a Toolbox resource, pin agents to a specific version, and use `McpTool(server_label="toolbox", ...)`.	Centralises tool updates, reduces deployment friction across multiple agents.
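The model-router row amounts to a configuration change, which might look like the sketch below. It assumes environment variables named `PROJECT_ENDPOINT` and `MODEL_DEPLOYMENT`; only `MODEL_DEPLOYMENT` and the `model-router` value appear in the table above, and `AgentSettings` and `load_settings` are hypothetical helpers.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AgentSettings:
    project_endpoint: str
    model_deployment: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> AgentSettings:
    """Read agent settings, defaulting the deployment to the model router."""
    env = os.environ if env is None else env
    endpoint = env.get("PROJECT_ENDPOINT", "").strip()
    if not endpoint:
        # Fail fast instead of discovering a missing endpoint on first call.
        raise ValueError("PROJECT_ENDPOINT is not set")
    return AgentSettings(
        project_endpoint=endpoint,
        model_deployment=env.get("MODEL_DEPLOYMENT", "model-router"),
    )
```

With one deployment name resolved at start-up, there is no per-request model choice left in application code to maintain.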

Next steps for teams

Clone the repo – git clone https://github.com/microsoft-foundry/Foundry-Agent-Lab.git
Run the hello demo – validates credentials, project endpoint and the Responses API.
Iterate through demos 1‑8 – add the next primitive only when your product requirement demands it.
Evaluate model‑router logs – MODEL-ROUTER.md shows which model was chosen; use this data to set cost budgets.
Plan production deployment –
- Use managed identity for authentication.
- Store vector‑store IDs and toolbox version numbers in Azure Key Vault.
- Enable human‑in‑the‑loop approvals for any side‑effecting MCP tools.
Contribute – the lab is MIT‑licensed; submit improvements or new demos via GitHub Issues.
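For the model-router evaluation step, a small tally of which model served each turn is often enough to start a cost budget. The sketch below assumes you have already collected one record per response with a `model` name and a `total_tokens` count; that record shape and `tally_models` are hypothetical, and the format of MODEL-ROUTER.md is not assumed.

```python
from collections import Counter
from typing import Dict, Iterable


def tally_models(records: Iterable[Dict]) -> Dict[str, Dict[str, int]]:
    """Count turns and sum tokens per model that served them."""
    turns: Counter = Counter()
    tokens: Counter = Counter()
    for record in records:
        turns[record["model"]] += 1
        tokens[record["model"]] += record.get("total_tokens", 0)
    return {m: {"turns": turns[m], "total_tokens": tokens[m]} for m in turns}
```

Multiplying each model's token total by its price then gives a first estimate of spend per demo or per feature.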

Resources

Foundry Agent Lab repo: https://github.com/microsoft-foundry/Foundry-Agent-Lab
Foundry SDK docs: https://learn.microsoft.com/azure/ai-studio/
Responses API quickstart: https://learn.microsoft.com/azure/ai-studio/responses-api-quickstart
Model Router overview: https://learn.microsoft.com/azure/ai-studio/model-router
Model Context Protocol: https://modelcontextprotocol.io
Azure Identity (DefaultAzureCredential): https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/identity/azure-identity

Bottom line

The Microsoft Foundry Agent Lab demonstrates that a production‑ready AI agent can be built with under 100 lines of code, a single model‑router deployment, and server‑side state management. By progressing through the nine demos, engineers gain a clear mental model of where to place tool logic, how to govern tool access, and when to take ownership of the inference path. The result is faster delivery, lower operational risk, and a cost‑optimised stack that scales from a simple “hello‑world” bot to a self‑hosted, multi‑tool orchestrator.

