QCon AI Boston 2026: Engineering Production-Ready AI Systems with Python

The QCon AI Boston 2026 program focuses on the practical challenges of deploying AI systems in production, with particular emphasis on context engineering, inference optimization, reliability, and the integration of AI into the software development lifecycle.

The full schedule for QCon AI Boston 2026, which runs June 1-2 at Boston University, is now live. The two-day program addresses the engineering challenges that follow a successful AI demonstration: getting agents into production, managing inference costs, ensuring system reliability, and integrating AI into the software development lifecycle.

Context Engineering for AI Agents

One of the most significant challenges in AI development is bridging the gap between impressive demos and production systems that can operate effectively within a company's specific services, data, and processes. Two key sessions address this challenge:

Ajay Prakash from LinkedIn will present "Context Engineering at LinkedIn: How We Built an Organizational Context Layer for AI Agents with MCP." This session explores how LinkedIn leveraged the Model Context Protocol (MCP) to help AI coding agents work effectively with internal services and frameworks. For Python developers, this represents an important approach to creating more adaptable AI tools that can understand and work with specific codebases rather than treating all environments identically.

Ricardo Ferreira from Redis will cover "Beyond Prompting: Context Engineering for Production-Grade AI," examining the data and retrieval context that shapes reliable LLM outputs. This session provides valuable insights for Python developers building production AI applications, emphasizing that successful implementation requires more than just prompt engineering.
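
To make the idea of "retrieval context" concrete, here is a minimal Python sketch of assembling retrieved snippets into a prompt. It is not taken from the session: the function names, the keyword-overlap scoring, and the prompt wording are all illustrative assumptions, and a production system would more likely use a vector or hybrid search backend.

```python
def score(query, document):
    """Count distinct query words that also appear in the document (hypothetical scoring)."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words & doc_words)


def build_prompt(query, documents, top_k=2):
    """Pick the top_k most relevant documents and place them ahead of the question."""
    ranked = sorted(documents, key=lambda d: score(query, d), reverse=True)
    # Drop documents that share no words with the query, even if they rank in the top_k.
    context = [d for d in ranked[:top_k] if score(query, d) > 0]
    lines = ["Answer using only the context below."]
    lines += [f"- {d}" for d in context]
    lines.append(f"Question: {query}")
    return "\n".join(lines)


documents = [
    "Refunds are processed within five days",
    "Our office is in Boston",
    "Refunds require an order number",
]
print(build_prompt("how are refunds processed", documents))
```

The point of the sketch is that the model only sees what the retrieval step selects, so the quality of that selection bounds the quality of the answer.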

Optimizing Inference Economics

For Python teams working at enterprise scale, inference cost and latency are critical architectural considerations. Three sessions explore different aspects of this challenge:

Khawaja Shams from Momento will present "Serving LLMs at Scale: The Hidden KV Cache Advantage," examining how KV cache optimization affects GPU utilization, throughput, and "Time to First Token." These metrics matter for Python applications built on libraries like Hugging Face Transformers or LangChain.
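
For readers unfamiliar with the term, the toy Python sketch below illustrates the general idea behind a KV cache. It is not the speaker's method. It assumes a single attention head, plain Python lists instead of GPU tensors, and hypothetical helper names (project, attend, decode). With the cache, each decoding step projects only the newest token into a key and a value. Without it, every step re-projects the whole prefix.

```python
import math


def project(vec, weight):
    """Multiply a vector by a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def attend(query, keys, values):
    """Softmax attention of one query over all keys and values seen so far."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]


def decode(tokens, w_q, w_k, w_v, use_cache):
    """Return per-step attention outputs and the number of tokens projected to K/V."""
    cache_k, cache_v = [], []
    outputs, kv_projections = [], 0
    for step, token in enumerate(tokens):
        if use_cache:
            # Only the newest token is projected; earlier keys/values are reused.
            cache_k.append(project(token, w_k))
            cache_v.append(project(token, w_v))
            kv_projections += 1
            keys, values = cache_k, cache_v
        else:
            # Without a cache, the whole prefix is re-projected at every step.
            prefix = tokens[: step + 1]
            keys = [project(t, w_k) for t in prefix]
            values = [project(t, w_v) for t in prefix]
            kv_projections += len(prefix)
        outputs.append(attend(project(token, w_q), keys, values))
    return outputs, kv_projections


# Illustrative inputs: four 2-dimensional "token embeddings" and identity weights.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
identity = [[1.0, 0.0], [0.0, 1.0]]

cached_out, cached_count = decode(tokens, identity, identity, identity, use_cache=True)
plain_out, plain_count = decode(tokens, identity, identity, identity, use_cache=False)
print(cached_out == plain_out, cached_count, plain_count)
```

Both paths produce the same outputs, but the uncached path performs work that grows quadratically with sequence length, while the cached path trades that compute for memory that grows with every token. That trade-off is why cache management affects GPU utilization and latency in real serving stacks.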

Deepak Chandramouli and Bhumik Thakkar from Apple will discuss "Beyond the Prototype: Scaling Frame Agnostic AI Agent Infrastructure with Ray." This session covers the transition from development to production using Ray, a popular Python framework for building distributed AI applications. The insights will be valuable for Python teams looking to scale their AI solutions.

Jordan Nanos from SemiAnalysis will provide "From Fab To Token: The State Of The Market," analyzing the physical and economic bottlenecks in AI infrastructure. This data-driven perspective helps Python developers understand the broader context in which their AI applications operate.

Ensuring Reliability and Safety

Several sessions address the critical aspects of safety, evaluation, and trust in AI systems:

Bruna Pereira from DoorDash will examine "SafeChat: Building AI-Powered Safety Systems at Scale in a Real-Time Marketplace," offering practical insights for Python developers working on safety-critical applications.

Mallika Rao from Netflix will present "Adaptive Recommenders in the Real World: Inference, Evals, and System Design," covering the implementation of continuously learning recommendation systems, a topic relevant to Python teams using libraries like TensorFlow Recommenders or PyTorch.

Susan Chang from Elastic will discuss "Building Reusable Evaluation Frameworks for Agentic AI Products," sharing agent-evaluation methods that have been proven in production environments.

AI Integration in the Development Lifecycle

The program also examines how AI is transforming software development practices:

Catherine Weeks from Red Hat will share "AI First, Quality Always: Agentic SDLC Adoption Case Study," offering practical guidance for Python teams adopting AI tools without compromising code quality.

Lizzie Matusov will deliver the opening keynote "The Five Stages of AI Maturity in Engineering Organizations," analyzing where teams typically stall in their AI adoption journey and how to overcome these obstacles.

For Python developers and teams, QCon AI Boston 2026 offers valuable insights into the practical challenges of building, deploying, and maintaining AI systems. The conference addresses the gap between AI research and production engineering, providing actionable knowledge for those working with Python's rich ecosystem of AI and machine learning libraries.

The full schedule is posted at boston.qcon.ai, where early bird pricing and team discounts are currently available.