Microsoft Foundry Expands AI Capabilities with Three New Open-Source Models

Microsoft Foundry now supports Qwen3-Coder-Next for coding agents, Qwen3-ASR-1.7B for multilingual speech recognition, and Z-Image for text-to-image generation, bringing enterprise deployment options to code, speech, and image workloads.

Microsoft has expanded its Foundry AI platform with three new open-source models distributed through Hugging Face, giving enterprise developers new options for coding, speech recognition, and image generation. The additions (Qwen3-Coder-Next, Qwen3-ASR-1.7B, and Z-Image) are each positioned to balance performance against practical deployment cost.

Qwen3-Coder-Next: Efficient Coding Agent for Complex Workflows

Qwen3-Coder-Next is an 80B-parameter Mixture-of-Experts model that activates only 3B parameters during inference and supports a 256k-token context window. Because inference compute scales with the active parameters rather than the total, the model is pitched as making advanced coding agents viable for local deployment on consumer hardware while maintaining competitive performance.
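
To put the efficiency claim in numbers, the following is a rough back-of-the-envelope sketch in Python. The 80B and 3B figures come from the article; the weight precisions (16, 8, and 4 bits) are assumptions chosen for illustration, and the estimate covers weights only, ignoring activations and the context cache. It reflects the general property of Mixture-of-Experts models that compute tracks the active parameters while weight storage tracks the total.

```python
# Back-of-the-envelope arithmetic for a sparse Mixture-of-Experts model.
TOTAL_PARAMS = 80e9    # total parameters, as stated in the article
ACTIVE_PARAMS = 3e9    # parameters activated during inference, as stated


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters used for a single forward step."""
    return active / total


def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Approximate storage for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"Active fraction: {frac:.2%}")
    # Precisions below are illustrative assumptions, not published settings.
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: ~{weight_memory_gb(TOTAL_PARAMS, bits):.0f} GB")
```

Under these assumptions only 3.75% of the parameters are active at each step, while the full set of weights still needs roughly 160 GB at 16-bit precision or 40 GB at 4-bit, so the practicality of local deployment depends heavily on quantization and available memory.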

Key capabilities include:

Long-horizon reasoning for complex development tasks
Advanced tool usage and execution recovery
Autonomous debugging with failure recovery
Multi-file code synthesis across large codebases

The model is aimed at agentic workflows that go beyond simple code completion, such as building a transaction reconciliation service for a fintech platform, complete with database integration, logging utilities, and comprehensive unit tests.

Qwen3-ASR-1.7B: Multilingual Speech Recognition Breakthrough

This 1.7B-parameter model is reported to achieve state-of-the-art accuracy across 52 languages and dialects (30 languages and 22 Chinese dialects), and it also handles multiple English accents. Language identification is built into the model, eliminating the need for a separate detection pipeline.

Performance highlights:

Reported to outperform GPT-4o, Gemini-2.5, and Whisper-large-v3 on major benchmarks
97.9% average accuracy for automatic language identification
Handles diverse audio types including singing voice and background music
Configurable context length up to 4096 tokens

The model suits call center analytics, content moderation, and meeting transcription. A single deployment can, for example, process customer service recordings in multiple languages without a separate model for each language.

Z-Image: Professional-Grade Text-to-Image Generation

Z-Image from Tongyi-MAI is a 6B-parameter undistilled foundation model that retains the complete training signal and offers full Classifier-Free Guidance (CFG) support. This enables complex prompt engineering and negative prompting, capabilities that distilled models typically give up.
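
For readers unfamiliar with Classifier-Free Guidance, the sketch below shows the standard guidance combination used by diffusion models in general. It is a minimal illustration, not Z-Image's actual code: the function name `cfg_combine` is hypothetical, and short lists of floats stand in for the model's conditional and unconditional predictions.

```python
from typing import List


def cfg_combine(cond: List[float], uncond: List[float], scale: float) -> List[float]:
    """Standard classifier-free guidance: uncond + scale * (cond - uncond).

    cond   -- prediction given the text prompt
    uncond -- prediction given an empty prompt (or a negative prompt)
    scale  -- guidance scale; 1.0 returns the conditional prediction unchanged
    """
    if len(cond) != len(uncond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


if __name__ == "__main__":
    # Toy values standing in for real model outputs.
    print(cfg_combine([1.0, 2.0], [0.0, 1.0], scale=4.0))
```

Negative prompting is commonly implemented by computing the `uncond` branch from the negative prompt instead of an empty one, which pushes the result away from that content. A model that runs without this second branch has nothing for a negative prompt to act on, which is the limitation the article attributes to distilled models.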

Creative capabilities include:

High output diversity for multi-person scenes with varied compositions
Support for resolutions from 512×512 to 2048×2048 at any aspect ratio
28-50 inference steps for optimal quality
Aesthetic versatility across photorealistic, anime, and stylized illustrations
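
The resolution and step ranges listed above can be checked before a request is sent. The sketch below is one way to do that in Python. The class `ImageRequest` and its fields are hypothetical and do not correspond to any Z-Image or Foundry API, and reading the resolution range as a per-side bound of 512 to 2048 pixels is an assumption.

```python
from dataclasses import dataclass
from typing import List

MIN_SIDE, MAX_SIDE = 512, 2048   # per-side bounds (assumed reading of the article)
MIN_STEPS, MAX_STEPS = 28, 50    # inference step range quoted in the article


@dataclass(frozen=True)
class ImageRequest:
    """Hypothetical container for generation settings; not a real API type."""
    width: int
    height: int
    steps: int

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the settings fit."""
        issues = []
        for name, side in (("width", self.width), ("height", self.height)):
            if not MIN_SIDE <= side <= MAX_SIDE:
                issues.append(f"{name}={side} is outside {MIN_SIDE}-{MAX_SIDE}")
        if not MIN_STEPS <= self.steps <= MAX_STEPS:
            issues.append(f"steps={self.steps} is outside {MIN_STEPS}-{MAX_STEPS}")
        return issues


if __name__ == "__main__":
    print(ImageRequest(width=1024, height=768, steps=30).problems())
    print(ImageRequest(width=256, height=4096, steps=10).problems())
```

Returning a list of issues rather than raising on the first one lets a caller report every out-of-range setting at once.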

The model excels at professional creative workflows, generating distinct character identities and handling complex visual styles for applications ranging from e-commerce product photography to entertainment content.

Deployment and Integration

Microsoft Foundry users can deploy these models directly through the Hugging Face collection in the Foundry model catalog or via one-click deployments from the Hugging Face Hub. The platform handles secure, scalable inference configuration automatically.

Getting started requires:

Browsing the Hugging Face collection in Foundry
Selecting supported models for deployment
Configuring managed endpoints in Azure
Accessing documentation for implementation guidance
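
Once a managed endpoint exists, calling it usually comes down to an authenticated HTTPS POST. The sketch below only builds such a request with Python's standard library and does not send it. Everything specific in it is an assumption for illustration: the URL, the key, the bearer-token header, and the `{"inputs": ...}` payload shape are placeholders, and the real values and schema should be taken from the deployed endpoint's own documentation.

```python
import json
import urllib.request


def build_request(endpoint_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a scoring endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; confirm against the endpoint's documentation.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL, key, and payload: none of these are real values.
    req = build_request(
        "https://example.invalid/score",
        "YOUR_API_KEY",
        {"inputs": "Write a function that reverses a string."},
    )
    print(req.get_method(), req.full_url)
    # To send the request: urllib.request.urlopen(req)
```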

These additions strengthen Microsoft's position in the enterprise AI market: each model targets strong results in its own domain while remaining practical to deploy. Running them through Microsoft Foundry gives developers enterprise-grade infrastructure for serving these capabilities at scale.

For developers looking to implement these models, Microsoft provides comprehensive documentation and a GitHub repository for staying updated on the latest releases and implementation patterns.