OpenAI's GDPval Benchmark Reveals AI's Real-World Strengths: Claude Edges Out GPT-5 in Key Areas
OpenAI's new GDPval evaluation framework tests leading AI models on economically significant workplace tasks and finds that Claude Opus excels at aesthetics while GPT-5 leads on accuracy. The study points to a 100x cost advantage for AI over human workers, but also highlights critical limitations in assessing contextual understanding.