OpenAI GPT-5.4 Thinking: Six Major Upgrades Powering the Next Generation of AI Agents

OpenAI's GPT-5.4 Thinking model delivers major improvements in coding, document analysis, tool use, and agent workflows with a 1M token context window and built-in computer use capabilities.

OpenAI has unveiled GPT-5.4 Thinking, the latest evolution of its flagship AI model that brings six significant improvements across coding, document processing, tool integration, and agent capabilities. The upgrade, which is rolling out gradually to ChatGPT and Codex users, represents OpenAI's most comprehensive enhancement since the December GPT-5.2 release.

Six Key Areas of Improvement

OpenAI has focused GPT-5.4's development on six critical areas where AI agents need the most advancement:

Coding, Document Understanding, Tool Use, and Instruction Following

The model shows substantial improvements in understanding complex codebases and technical documentation. With its larger context window, GPT-5.4 can take in an entire repository in a single request when the repository fits within that window, making it particularly valuable for developers working with large codebases or legacy systems.

Image Perception and Multimodal Tasks

Enhanced visual processing allows GPT-5.4 to better interpret images, charts, and diagrams alongside text. This multimodal capability is crucial for tasks that require understanding both visual and textual information simultaneously.

Long-Running Task Execution and Multi-Step Agent Workflows

GPT-5.4 excels at maintaining context over extended interactions, enabling more complex agent workflows that require multiple steps and decision points. This improvement addresses one of the major limitations of previous models in sustained task execution.

Token Efficiency and End-to-End Performance on Tool-Heavy Workloads

The model is more token-efficient when calling external tools, reducing the overhead typically associated with tool-heavy workloads. This makes agent workflows more efficient and responsive end to end.

Agentic Web Search and Multi-Source Synthesis

GPT-5.4 shows marked improvement in searching across multiple sources and synthesizing information, particularly for hard-to-find data. This capability is essential for research tasks and complex information gathering.

Document-Heavy and Spreadsheet-Heavy Business Workflows

The model handles business-critical documents and spreadsheets with greater accuracy, making it more suitable for customer service, analytics, and financial applications where precision is paramount.

Revolutionary Context Window and Computer Use

Perhaps the most significant technical advancement is GPT-5.4's expanded 1 million token context window. The larger window allows the model to analyze entire codebases, extensive document collections, or complex agent trajectories in a single request, which widens the range of tasks an AI agent can take on without splitting its input.
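To get a feel for what a 1 million token window means in practice, the sketch below estimates whether a set of files would fit in a single request. The four-characters-per-token ratio is a rough rule of thumb rather than OpenAI's tokenizer, and the function names and the reserved output budget are assumptions made for illustration.

```python
from pathlib import Path

CONTEXT_WINDOW_TOKENS = 1_000_000  # figure reported for GPT-5.4
CHARS_PER_TOKEN = 4                # rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count using a characters-per-token ratio."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(texts, reserved_for_output: int = 50_000) -> bool:
    """Return True if the combined texts fit, leaving room for the reply."""
    total = sum(estimate_tokens(t) for t in texts)
    return total + reserved_for_output <= CONTEXT_WINDOW_TOKENS


def read_repo(root: str, suffixes=(".py", ".md")):
    """Yield the contents of matching files under root (hypothetical helper)."""
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            yield path.read_text(encoding="utf-8", errors="ignore")
```

Calling fits_in_context(read_repo("path/to/project")) gives a quick yes-or-no answer before sending anything. For a real budget, an actual tokenizer should replace the character heuristic.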

OpenAI is also touting GPT-5.4 as its first "mainline model" with built-in computer use capabilities. This means agents can now interact directly with software applications, completing tasks through a build-run-verify-fix loop without requiring external tool integration. This native computer use represents a significant step toward more autonomous AI systems.
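The build-run-verify-fix loop mentioned above can be pictured as a simple retry loop. This is a generic sketch of the pattern, not OpenAI's implementation: the StepResult type, the four callbacks, and the max_attempts limit are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepResult:
    ok: bool
    log: str


def build_run_verify_fix(
    build: Callable[[], StepResult],
    run: Callable[[], StepResult],
    verify: Callable[[], StepResult],
    fix: Callable[[str], None],
    max_attempts: int = 3,
) -> bool:
    """Run build, run, verify in order; on a failure, call fix with the
    failure log and start the next attempt if any remain."""
    for _ in range(max_attempts):
        for step in (build, run, verify):
            result = step()
            if not result.ok:
                fix(result.log)  # hand the failure log to the fixer
                break
        else:
            return True  # all three steps succeeded in this attempt
    return False  # attempts exhausted without a clean pass
```

In an agent setting, each callback would be backed by the model acting on a real application; here they are plain functions so the control flow is easy to follow.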

Compaction Support for Longer Agent Trajectories

The model introduces "compaction" support, a feature that enables longer agent trajectories while preserving key context. This technical advancement allows agents to maintain coherence over extended interactions and complex workflows that would previously have exceeded the model's capacity.
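OpenAI's description does not spell out how compaction works internally, so the following is only a sketch of the general idea: when a conversation history grows past a budget, older messages are folded into a summary while the most recent ones are kept verbatim. The compact function, the summarize callback, and the token heuristic are assumptions for illustration.

```python
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    """Rough token estimate (about 4 characters per token); an assumption."""
    return max(1, len(text) // 4)


def compact(
    history: List[str],
    budget: int,
    summarize: Callable[[List[str]], str],
    keep_recent: int = 2,
) -> List[str]:
    """Fold older messages into one summary when history exceeds budget."""
    total = sum(estimate_tokens(m) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return history  # nothing to compact
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    # The summary stands in for everything before the recent messages.
    return [summarize(older)] + recent
```

In a real agent the summarize callback would itself be a model call that preserves key decisions and facts; here it is left as a parameter so the trade-off between history length and retained detail stays visible.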

Availability and Subscription Tiers

GPT-5.4 Thinking is available immediately for Plus, Team, and Pro subscribers in ChatGPT and Codex. The model will gradually replace GPT-5.2 Thinking, which OpenAI has announced will be deprecated within three months.

Replacing the older model in phases gives OpenAI room to monitor performance and gather user feedback before the switch is complete. A gradual deployment also helps manage the computational resources required for such a capable model.

The Competitive Landscape

The timing of this release is significant, coming just days after the GPT-5.3 Instant launch and following OpenAI's "code red" response to Google's Gemini in December. The rapid iteration cycle demonstrates OpenAI's commitment to maintaining its technological lead in the competitive AI landscape.

GPT-5.4's improvements in coding and document understanding directly address areas where enterprise users have demanded better performance. The enhanced tool use and agent capabilities position OpenAI to compete more effectively in the growing market for autonomous AI agents.

Practical Applications

For developers, GPT-5.4 offers the ability to work with entire codebases without context switching, potentially revolutionizing how AI assists in software development. The improved document understanding makes it particularly valuable for legal, financial, and research applications where accuracy with complex documents is critical.

The built-in computer use capabilities open new possibilities for automation, allowing AI agents to interact with software interfaces directly rather than through API calls or custom integrations.

Looking Forward

GPT-5.4 represents a significant step toward more capable and autonomous AI agents. The combination of expanded context windows, improved tool use, and native computer interaction capabilities suggests OpenAI is positioning itself for the next wave of AI applications that go beyond simple chat interfaces.

As the model rolls out to users, we'll likely see new use cases emerge that take advantage of these capabilities in ways that weren't possible with previous versions. The three-month deprecation timeline for GPT-5.2 Thinking also suggests OpenAI is confident in the stability and capability of this new release.

For current ChatGPT subscribers, the upgrade should arrive automatically in their accounts, with the most capable features available to Pro users who need the highest performance for complex tasks.
