Microsoft Foundry now supports Qwen3-Coder-Next for coding agents, Qwen3-ASR-1.7B for multilingual speech recognition, and Z-Image for text-to-image generation, offering enterprise-grade AI capabilities across multiple modalities.
Microsoft has expanded its Foundry AI platform with three new open-source models from Hugging Face, giving enterprise developers additional options for coding, speech recognition, and image generation. Together, Qwen3-Coder-Next, Qwen3-ASR-1.7B, and Z-Image extend the catalog with models that balance performance against practical deployment constraints.
Qwen3-Coder-Next: Efficient Coding Agent for Complex Workflows
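Qwen3-Coder-Next is an 80B-parameter Mixture-of-Experts model that activates only 3B parameters during inference and offers a 256k-token context window. Because only a small fraction of the weights takes part in each forward pass, the compute cost is far lower than for a dense model of the same size. According to the announcement, this efficiency makes advanced coding agents viable for local deployment on consumer hardware while maintaining competitive performance.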
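The efficiency claim comes down to simple arithmetic, sketched below. The parameter counts are the ones quoted above; the bytes-per-parameter values are illustrative assumptions about weight precision, not published deployment requirements. Note that a Mixture-of-Experts model still has to keep all of its weights available, so the active-parameter count mainly reduces compute rather than storage.

```python
# Parameter counts quoted in the article.
TOTAL_PARAMS = 80e9   # all experts combined
ACTIVE_PARAMS = 3e9   # parameters activated during inference


def active_fraction(total: float, active: float) -> float:
    """Share of the weights that participates in a forward pass."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes).

    Ignores activations, KV cache, and runtime overhead.
    """
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    print(f"active share: {active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS):.2%}")
    # Byte widths are assumptions for illustration only.
    for label, width in (("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)):
        size = weight_memory_gb(TOTAL_PARAMS, width)
        print(f"{label} weights: about {size:.0f} GB")
```

Under these assumptions, about 3.75% of the weights are active at a time, while the full set of weights still occupies roughly 160 GB at 16-bit precision or 40 GB at 4-bit precision.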
Key capabilities include:
- Long-horizon reasoning for complex development tasks
- Advanced tool usage and execution recovery
- Autonomous debugging with failure recovery
- Multi-file code synthesis across large codebases
The model targets agentic workflows that go beyond simple code completion. One example is building a transaction reconciliation service for a fintech platform, complete with database integration, logging utilities, and comprehensive unit tests.
Qwen3-ASR-1.7B: Multilingual Speech Recognition Breakthrough
This 1.7B-parameter model achieves state-of-the-art accuracy across 52 languages and dialects, comprising 30 languages and 22 Chinese dialects, and it also handles multiple English accents. Language identification is built into the model, eliminating the need for a separate detection pipeline.
Performance highlights:
- Outperforms GPT-4o, Gemini-2.5, and Whisper-large-v3 on major benchmarks
- 97.9% average accuracy for automatic language identification
- Handles diverse audio types including singing voice and background music
- Configurable context length up to 4096 tokens
The model's versatility extends to call center analytics, content moderation, and meeting transcription. A team can, for example, process customer service recordings in multiple languages without managing a separate model for each language.
Z-Image: Professional-Grade Text-to-Image Generation
Z-Image from Tongyi-MAI is a 6B-parameter undistilled foundation model that preserves the complete training signal and fully supports Classifier-Free Guidance. This enables complex prompt engineering and negative prompting, capabilities that distilled models typically give up.
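For readers unfamiliar with Classifier-Free Guidance, the sketch below shows the standard combination step used by guided diffusion samplers in general. It is not taken from Z-Image's code: the function name, the plain-list representation of the predictions, and the example values are all assumptions made for illustration. In common practice, negative prompting works by using the prediction for the negative prompt in place of the unconditional one.

```python
from typing import List


def cfg_combine(uncond: List[float], cond: List[float],
                guidance_scale: float) -> List[float]:
    """Standard classifier-free guidance combination.

    uncond: model prediction without the prompt (or, for negative
            prompting, the prediction for the negative prompt)
    cond:   model prediction for the positive prompt
    Returns uncond + guidance_scale * (cond - uncond), element-wise.
    """
    if len(uncond) != len(cond):
        raise ValueError("predictions must have the same length")
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]


if __name__ == "__main__":
    # Toy two-element "predictions"; real ones are large tensors.
    uncond = [0.0, 1.0]
    cond = [1.0, 3.0]
    print(cfg_combine(uncond, cond, 1.0))  # scale 1 reproduces cond
    print(cfg_combine(uncond, cond, 2.0))  # larger scale pushes past cond
```

A scale of 1 returns the conditional prediction unchanged, and larger scales move the result further away from the unconditional (or negative-prompt) prediction. Both predictions are needed at every step, which is why the technique depends on a model that retains the unconditional behavior.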
Creative capabilities include:
- High output diversity for multi-person scenes with varied compositions
- Support for resolutions from 512×512 to 2048×2048 at any aspect ratio
- 28-50 inference steps for optimal quality
- Aesthetic versatility across photorealistic, anime, and stylized illustrations
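The ranges in the list above can be checked before a generation request is sent. The following is a minimal sketch, not part of any Z-Image or Foundry API: the function name is hypothetical, and it assumes that "512×512 to 2048×2048 at any aspect ratio" refers to the total pixel count and that the step range is inclusive. Both readings are interpretations of the article's wording.

```python
from typing import List

# Bounds derived from the article's figures (interpretation is an assumption).
MIN_PIXELS = 512 * 512
MAX_PIXELS = 2048 * 2048
MIN_STEPS, MAX_STEPS = 28, 50


def check_request(width: int, height: int, steps: int) -> List[str]:
    """Return a list of problems; an empty list means the request is in range."""
    problems = []
    pixels = width * height
    if width <= 0 or height <= 0:
        problems.append("width and height must be positive")
    elif not MIN_PIXELS <= pixels <= MAX_PIXELS:
        problems.append(f"pixel count {pixels} is outside the stated range")
    if not MIN_STEPS <= steps <= MAX_STEPS:
        problems.append(f"step count {steps} is outside the stated range")
    return problems


if __name__ == "__main__":
    print(check_request(1024, 1024, 30))  # in range under these assumptions
    print(check_request(256, 256, 10))    # too small, too few steps
```

Returning a list of problems rather than raising on the first one lets a caller report every out-of-range setting at once.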
The model excels at professional creative workflows, generating distinct character identities and handling complex visual styles for applications ranging from e-commerce product photography to entertainment content.
Deployment and Integration
Microsoft Foundry users can deploy these models directly through the Hugging Face collection in the Foundry model catalog or via one-click deployments from the Hugging Face Hub. The platform handles secure, scalable inference configuration automatically.
Getting started requires:
- Browsing the Hugging Face collection in Foundry
- Selecting supported models for deployment
- Configuring managed endpoints in Azure
- Accessing documentation for implementation guidance
These additions strengthen Microsoft's position in the enterprise AI market by offering production-ready models for coding, speech recognition, and image generation that remain practical to deploy. Microsoft Foundry supplies the enterprise-grade infrastructure for running them at scale.

For developers looking to implement these models, Microsoft provides comprehensive documentation and a GitHub repository for staying updated on the latest releases and implementation patterns.
