OpenAI's new GDPval evaluation framework tests top AI models on economically significant workplace tasks, finding that Claude Opus excels in aesthetics while GPT-5 leads on accuracy. The study reports a 100x cost advantage for AI over human workers but also highlights critical limitations in the evaluation's ability to assess contextual understanding.