
OpenAI's GDPval Benchmark Reveals AI's Real-World Strengths: Claude Edges Out GPT-5 in Key Areas

September 25, 2025

OpenAI's new GDPval evaluation framework tests top AI models on economically significant workplace tasks, finding that Claude Opus excels in aesthetics while GPT-5 leads on accuracy. The study points to a 100x cost advantage for AI over humans but highlights critical limitations in assessing contextual understanding.