Microsoft addresses a critical security gap in enterprise AI systems with the Entra ID On-Behalf-Of flow, enabling multi-agent architectures to maintain user identity and access controls when querying backend services like Databricks.
The enterprise AI landscape faces a significant security challenge: how to maintain user identity and access controls when AI agents query backend services on behalf of users. Traditional implementations often use shared service accounts or Personal Access Tokens (PATs), effectively bypassing row-level security (RLS), column masking, and other data governance policies. The result is a vulnerability in which users can reach data through an AI agent that they would never be permitted to access directly.
Microsoft has addressed this challenge through an implementation of the Entra ID On-Behalf-Of (OBO) flow in a custom multi-agent LangGraph solution, specifically designed for Databricks Genie. This approach enables AI agents to query and modify data while preserving all role-based access control (RBAC) policies and maintaining a complete audit trail.
The Architecture: Key Components
The solution integrates several critical technologies:
Chainlit: A Python-based web interface for LLM-driven conversational applications, customized to meet specific UI requirements while avoiding the need for a bespoke React front end.
Azure App Service: Provides managed hosting with built-in authentication support and autoscaling capabilities.
LangGraph: An open-source multi-agent orchestration framework that coordinates the interactions between different AI agents.
Azure Databricks Genie: A natural language to SQL agent that enables users to query data through conversational interfaces.
Azure Cosmos DB: Serves as long-term memory and checkpoint storage for the system.
Microsoft Entra ID: Functions as the identity provider with OBO support, enabling secure token exchange between systems.
The architecture features specialized agents:
- Genie: Handles read-only natural language queries with per-user OBO authentication
- Task Agent: Manages sensitive operations like SQL modifications with human-in-the-loop approval and OBO authentication
- Memory: A shared agent that doesn't require per-user authentication
The Problem with Standard Chainlit OAuth Integration
While Chainlit was successfully integrated with Microsoft Entra ID for OAuth authentication, the default implementation revealed significant limitations. The standard approach assumes Microsoft Graph scopes, which creates several barriers for enterprise integration:
- The access token provided is scoped specifically for Microsoft Graph API
- This token cannot be used for OBO flow to downstream services like Databricks
- The token's audience is graph.microsoft.com, not the target application
For OBO to function correctly, the access token must have:
- An audience matching the application's client ID
- Scopes that include custom API permissions (e.g., api://{client_id}/access_as_user)
The Solution: Custom Entra ID OBO Provider
The breakthrough came with the development of a custom OAuth provider that replaces Chainlit's built-in authentication. The key insight was requesting api://{client_id}/access_as_user as the scope, which ensures the returned access token has the correct audience for OBO exchange.
Since this token cannot be used to call Graph API (due to incorrect audience), the solution extracts user information from the ID token claims instead. This approach maintains user identity while enabling secure token exchange.
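A minimal sketch of that claims-extraction step is shown below. The helper names (`decode_jwt_claims`, `user_from_id_token`) and the claim mapping are illustrative assumptions, not the actual provider code; a real custom OAuth provider must also validate the token's signature, issuer, audience, and expiry (e.g., with PyJWT against the Entra ID JWKS) rather than decoding it blindly.

```python
import base64
import json


def decode_jwt_claims(jwt_token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT signature verification.

    Illustration only: production code must verify the signature, issuer,
    audience, and expiry before trusting any claim.
    """
    payload_b64 = jwt_token.split(".")[1]
    # Restore the base64url padding stripped by JWT encoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def user_from_id_token(id_token: str) -> dict:
    """Build a user record from ID-token claims, since the access token's
    audience (api://{client_id}) cannot be used to call Microsoft Graph."""
    claims = decode_jwt_claims(id_token)
    return {
        "identifier": claims.get("preferred_username") or claims.get("upn"),
        "display_name": claims.get("name"),
        "object_id": claims.get("oid"),
    }
```

The `preferred_username`, `name`, and `oid` claims are standard in Entra ID tokens, which is why the ID token can stand in for a Graph API profile lookup here.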
The OBO Token Exchange Process
Once the user's access token is acquired with the correct audience, the system exchanges it for a Databricks-scoped token using Microsoft Authentication Library (MSAL). The resulting token:
- Has an audience matching the Databricks resource ID
- Contains the user's identity information (UPN, OID)
- Can be used with Databricks SDK/API
- Respects all Unity Catalog permissions configured for that specific user
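The exchange itself can be sketched with MSAL's `acquire_token_on_behalf_of`. The function and constant names below are assumptions for illustration; the Databricks scope uses the well-known application ID of the Azure Databricks resource (`2ff814a6-3304-4ab8-85cb-cd0e6f879c1d`) with the `user_impersonation` permission.

```python
# Well-known application ID of the Azure Databricks resource.
DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"


def databricks_obo_scopes(resource_id: str = DATABRICKS_RESOURCE_ID) -> list:
    """Scopes requested during the OBO exchange: the user_impersonation
    permission on the Azure Databricks resource."""
    return [f"{resource_id}/user_impersonation"]


def exchange_for_databricks_token(user_access_token: str,
                                  client_id: str,
                                  client_secret: str,
                                  tenant_id: str) -> str:
    """Exchange the user's api://{client_id} access token for a
    Databricks-scoped token via the MSAL on-behalf-of grant."""
    import msal  # third-party; imported lazily to keep the sketch self-contained

    app = msal.ConfidentialClientApplication(
        client_id,
        client_credential=client_secret,
        authority=f"https://login.microsoftonline.com/{tenant_id}",
    )
    result = app.acquire_token_on_behalf_of(
        user_assertion=user_access_token,
        scopes=databricks_obo_scopes(),
    )
    if "access_token" not in result:
        raise RuntimeError(result.get("error_description", "OBO exchange failed"))
    return result["access_token"]
```

The `user_assertion` parameter carries the original user token, which is how the resulting Databricks token inherits the user's identity rather than the app's.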
Per-User Agent Architecture
A critical design decision was to never cache user-specific agents globally. Each user receives their own Genie agent instance, ensuring complete isolation of data access permissions. This approach prevents potential cross-contamination of user data and maintains strict access controls.
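The isolation rule can be sketched as a registry keyed strictly by user identity. This is a hypothetical structure, not the production code; `factory` stands in for whatever builds a Genie agent from the user's OBO-acquired token.

```python
from typing import Callable


class PerUserAgentRegistry:
    """Holds one agent instance per user object ID; never a shared singleton."""

    def __init__(self, factory: Callable[[str], object]):
        self._factory = factory  # builds an agent from a user's Databricks token
        self._agents: dict = {}

    def agent_for(self, user_oid: str, databricks_token: str):
        # Keying strictly by user identity means one user's agent (and the
        # token embedded in it) can never serve another user's request.
        # A production version would also evict entries on token expiry.
        if user_oid not in self._agents:
            self._agents[user_oid] = self._factory(databricks_token)
        return self._agents[user_oid]
```

The anti-pattern this replaces is a module-level `genie_agent = ...` created once at startup, which would silently serve every user with the first user's credentials.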
Integration with Databricks Genie
The integration point involves passing the OBO-acquired token to the Databricks SDK's WorkspaceClient. The token flows through the system from Chainlit's OAuth callback → session config → LangGraph config → agent creation, ensuring every Genie query runs with the authenticated user's permissions.
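That token-threading chain can be sketched as below. The session keys, config shape, and workspace host are illustrative assumptions; `WorkspaceClient(host=..., token=...)` is the Databricks SDK's standard token-based constructor, here fed the OBO-acquired Entra ID token instead of a PAT.

```python
def genie_config_from_session(session: dict) -> dict:
    """Thread the OBO token from the Chainlit session into the
    `configurable` section of a LangGraph run config (hypothetical keys)."""
    return {
        "configurable": {
            "databricks_token": session["databricks_token"],
            "user_oid": session["user_oid"],
        }
    }


def make_workspace_client(config: dict):
    """Create a Databricks WorkspaceClient authenticated as the end user."""
    from databricks.sdk import WorkspaceClient  # third-party

    return WorkspaceClient(
        host="https://adb-1234567890123456.7.azuredatabricks.net",  # placeholder host
        token=config["configurable"]["databricks_token"],
    )
```

Because the token reaches agent creation through the per-run config rather than a global, every Genie query is executed under the permissions of the user who started that run.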
Human-in-the-Loop for Sensitive Operations
While Databricks Genie handles natural language queries (read-only), the system also supports custom SQL execution for data modifications. Since these operations can potentially DELETE or UPDATE data, the implementation includes human-in-the-loop approval using LangGraph's interrupt feature.
The OBO token ensures that even when executing user-authored SQL, the query runs with the user's permissions—they can only modify data they're authorized to change. A separate LLM-based intent analysis detects potentially destructive operations before they're executed.
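A sketch of that approval gate is shown below. LangGraph's `interrupt` (from `langgraph.types`) suspends the graph until a human resumes it; the keyword regex is a deliberately crude stand-in for the LLM-based intent analysis described above, and the node shape and state keys are assumptions.

```python
import re

# Stand-in for the LLM-based intent analysis: flag statements that modify data.
DESTRUCTIVE = re.compile(
    r"^\s*(DELETE|UPDATE|DROP|TRUNCATE|ALTER|INSERT|MERGE)\b", re.IGNORECASE
)


def requires_approval(sql: str) -> bool:
    """Return True for SQL that should pause the graph for human approval."""
    return bool(DESTRUCTIVE.search(sql))


def task_agent_node(state: dict) -> dict:
    """LangGraph node sketch: interrupt before running destructive SQL."""
    from langgraph.types import interrupt  # third-party

    sql = state["sql"]
    if requires_approval(sql):
        # interrupt() suspends execution until the user resumes with a decision.
        decision = interrupt({"question": f"Run this statement?\n{sql}"})
        if decision != "approve":
            return {"result": "cancelled by user"}
    # Actual execution (with the user's OBO token) is elided here.
    return {"result": "executed"}
```

Even if the heuristic misses a destructive statement, the OBO token bounds the blast radius: the SQL still runs only with the permissions Unity Catalog grants that user.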
Entra ID App Registration Requirements
For organizations implementing this solution, the Entra ID app registration requires specific configurations:
- API Permissions: Azure Databricks → user_impersonation (admin consent required)
- Expose an API: Scope access_as_user on URI api://{client-id}
- Redirect URI: {your-app-url}/auth/oauth/azure-ad/callback
Key Technical Insights
The implementation revealed several important technical considerations:
- Token Audience Matters: OBO fails if the initial token has the wrong audience
- Don't Cache User-Specific Clients: This breaks user isolation and creates security risks
- ID Tokens Contain User Info: When Graph API calls aren't possible, use token claims
- Human-in-the-Loop for Destructive Operations: Even with RBAC, require explicit user confirmation
Migration Considerations and Cost Analysis
Organizations considering this implementation should evaluate several factors:
Migration Path:
- Existing systems using PATs or shared service accounts can be incrementally updated
- The solution can be implemented alongside existing authentication systems during a transition period
- Organizations should prioritize high-risk data access points first when implementing OBO
Cost Implications:
- Azure App Service pricing varies based on compute resources required
- Entra ID OBO doesn't incur additional licensing costs beyond standard Entra ID subscriptions
- Databricks costs will remain consistent, but improved governance may prevent costly data breaches
- Development resources required for implementation (typically 2-3 weeks for experienced teams)
Return on Investment:
- Prevents potential data breaches that could cost millions in regulatory fines and reputational damage
- Enables secure adoption of AI without compromising existing security investments
- Reduces manual oversight for data access, freeing up security and compliance resources
Business Impact and Benefits
This implementation delivers significant business value:
- User Identity Preservation: Maintains complete user identity across AI agents
- RBAC Enforcement: Ensures proper access controls at the Databricks/Unity Catalog level
- Complete Audit Trail: Shows exactly which user made each query
- Zero-Trust Architecture: The AI agent never has more access than the authenticated user
- Governance Compliance: Supports regulatory requirements through proper access controls
Future Considerations
While this solution effectively addresses current needs, the article suggests that as multi-agent AI systems evolve, organizations will require more centralized services that standardize identity and user delegation across systems. Emerging capabilities in platforms like Microsoft Entra Agent ID and Azure AI Foundry point in this direction, though they remain in preview and primarily focus on first-party ecosystems.
Conclusion
Microsoft's implementation of Entra ID OBO flow in multi-agent systems represents a significant advancement in securing enterprise AI architectures. By preserving user identity across AI agents and maintaining proper access controls, this approach enables organizations to leverage the power of AI while maintaining strict data governance and security standards.
The solution demonstrates that with careful architecture design and proper implementation of identity delegation protocols, enterprises can create secure multi-agent AI systems that respect existing security policies while enabling powerful new capabilities. As AI becomes increasingly integrated into enterprise operations, such security innovations will become essential components of responsible AI deployment strategies.
This approach applies beyond Databricks to any Azure service supporting OAuth 2.0, making it a versatile solution for organizations building multi-cloud AI architectures. It also forms part of the AI governance foundation for enterprise custom multi-agent AI solutions, ensuring compliance with Microsoft's Secure Future Initiative (SFI) and zero-trust principles.