OpenAI's GPT-image-2 model launches on Microsoft Foundry with 4K resolution, multilingual support, and intelligent routing for production workflows.
Microsoft has announced the general availability of OpenAI's GPT-image-2 on Microsoft Foundry, marking a significant upgrade for enterprise teams needing high-quality, scalable image generation capabilities.

The Enterprise Challenge: Scaling Visual Content Creation
Consider a small design team managing a global social media campaign. They need to produce localized imagery for every market, but lack the resources to reshoot, reformat, or outsource at scale. Every asset must fit different platforms, dimensions, and cultural contexts—all while meeting tight deadlines.
This is the exact problem GPT-image-2 aims to solve. By bringing advanced image generation directly into Microsoft Foundry, developers and designers can now execute campaigns with the reach and flexibility of much larger teams.
What's New in GPT-image-2
Real-World Intelligence
GPT-image-2 features a knowledge cutoff of December 2025, providing more contextually relevant and accurate outputs. The model includes enhanced thinking capabilities that allow it to search the web, verify its own outputs, and generate multiple images from a single prompt. This transforms image generation from a simple tool into a creative sidekick.
Multilingual Understanding
The model now supports Japanese, Korean, Chinese, Hindi, and Bengali, with improved thinking capabilities for localized content creation. This means teams can generate images and render text that feels genuinely localized rather than translated.
4K Resolution Support
GPT-image-2 introduces 4K resolution capabilities, enabling developers to generate rich, detailed, and photorealistic images at custom dimensions. Key constraints to note:
- Total pixel budget: Maximum 8,294,400 pixels per image
- Minimum pixels: 655,360 pixels per image
- Automatic resizing: Requests exceeding limits are resized automatically
- Supported resolutions: 4K (the 8,294,400-pixel maximum equals 3840x2160), 1024x1024, 1536x1024, 1024x1536
- Dimension alignment: Each dimension must be a multiple of 16
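The constraints above lend themselves to a quick client-side check before a request is sent. The following is a minimal sketch, not part of any Microsoft or OpenAI SDK: the function name `check_size` and the constant names are hypothetical, and the limits are simply the figures quoted in the list. How the service actually resizes out-of-range requests is not described here, so the sketch only reports problems rather than trying to mimic that behavior.

```python
# Limits as quoted in the article; the names below are illustrative only.
MAX_PIXELS = 8_294_400   # equals 3840 x 2160
MIN_PIXELS = 655_360
ALIGNMENT = 16           # each dimension must be a multiple of this


def check_size(width: int, height: int) -> list[str]:
    """Return a list of constraint violations (empty if the size looks valid)."""
    problems = []
    if width <= 0 or height <= 0:
        return ["width and height must be positive"]
    if width % ALIGNMENT != 0:
        problems.append(f"width {width} is not a multiple of {ALIGNMENT}")
    if height % ALIGNMENT != 0:
        problems.append(f"height {height} is not a multiple of {ALIGNMENT}")
    pixels = width * height
    if pixels > MAX_PIXELS:
        problems.append(f"{pixels} pixels exceeds the maximum of {MAX_PIXELS}")
    if pixels < MIN_PIXELS:
        problems.append(f"{pixels} pixels is below the minimum of {MIN_PIXELS}")
    return problems


if __name__ == "__main__":
    for size in [(3840, 2160), (1024, 1024), (1000, 1000), (4096, 4096)]:
        issues = check_size(*size)
        print(size, "ok" if not issues else "; ".join(issues))
```

A check like this is mainly useful for catching sizes that would be silently resized, for example a 4096x4096 request, which is aligned but over the pixel budget.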
Intelligent Routing Layer
Perhaps the most innovative feature is the intelligent routing layer with two distinct modes:
Mode 1 - Legacy Size Selection: Automatically selects from three legacy size tiers (small, standard, large) without manual changes.
Mode 2 - Token Size Bucket Selection: Chooses from six token size buckets (16, 24, 36, 48, 64, 96) that map to legacy tiers, offering more flexibility in token generation for optimized output quality and efficiency.
Real-World Performance
Microsoft demonstrated GPT-image-2's improvements using a consistent prompt: "Interior of an empty subway car (no people). Wide-angle view looking down the aisle. Clean, modern subway car with seats, poles, route map strip, and ad frames above the windows. Realistic lighting with a slight cool fluorescent tone, realistic materials (metal poles, vinyl seats, textured floor)."

Comparing outputs across GPT-image-1, GPT-image-1.5, and GPT-image-2 shows clear improvements in image quality and realism with each iteration.

The model also excels at iterative refinement. Starting with the subway car, users can add a "Zava Flower Delivery" ad campaign to the ad frames, then refine it to show only roses.

Three simple prompts transformed a basic scene into a complete marketing mockup.
Industry Applications
These capabilities unlock production-ready image generation workflows across multiple sectors:
- Retail & E-commerce: Generate platform-specific product imagery without post-processing
- Marketing: Create localized campaign visuals and social assets
- Media & Entertainment: Generate storyboard panels and scenes at production-ready resolutions
- Education & Training: Create visual learning aids formatted for specific devices
- UI/UX Design: Accelerate mockup workflows with interface assets at precise design system dimensions
Trust and Safety
Microsoft emphasizes responsible AI deployment. GPT-image-2 undergoes internal reviews and includes safeguards for enterprise use. The deployment combines OpenAI's image generation safety mitigations with Azure AI Content Safety, including filters and classifiers for sensitive content.
Pricing and Availability
GPT-image-2 is available through Microsoft Foundry with the following pricing structure (per 1M tokens):
- Image Generation: Input $8, Cached Input $2, Output $30
- Text Generation: Input $5, Cached Input $1.25, Output $10
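To make the per-1M-token rates concrete, here is a rough cost estimator. It is a sketch under stated assumptions: the function `estimate_cost` and the `RATES` table are hypothetical names, the rates are copied from the two rows above, and the token counts must come from your own usage data, since the number of tokens a given image consumes is not stated here.

```python
# USD per 1,000,000 tokens, as listed in the article.
# The dictionary layout and key names are illustrative assumptions.
RATES = {
    "image": {"input": 8.00, "cached_input": 2.00, "output": 30.00},
    "text": {"input": 5.00, "cached_input": 1.25, "output": 10.00},
}


def estimate_cost(kind: str, input_tokens: int = 0,
                  cached_input_tokens: int = 0, output_tokens: int = 0) -> float:
    """Estimate cost in USD for one pricing row ("image" or "text")."""
    if kind not in RATES:
        raise ValueError(f"unknown pricing row: {kind!r}")
    rate = RATES[kind]
    total = (
        input_tokens * rate["input"]
        + cached_input_tokens * rate["cached_input"]
        + output_tokens * rate["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical token counts, chosen only to show the arithmetic.
    print(estimate_cost("image", input_tokens=200_000, output_tokens=50_000))
    print(estimate_cost("text", input_tokens=500_000, cached_input_tokens=1_000_000))
```

Summing the results across both rows gives a ballpark figure for a mixed workload. Actual billing may differ from this simple multiplication.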
Getting Started
Teams can immediately begin using GPT-image-2 through Microsoft Foundry by:
- Deploying the model directly in Microsoft Foundry
- Experimenting with the Image Playground
- Consulting the comprehensive documentation
The rollout represents Microsoft's continued investment in bringing cutting-edge AI capabilities to enterprise workflows, addressing real-world content creation challenges at scale.
For teams struggling with visual content production across markets and platforms, GPT-image-2 offers a compelling solution that combines quality, flexibility, and enterprise-grade safety controls.