ChatGPT Agent Put to the Test: One Brilliant Spark Amidst a Sea of Hallucinations

July 21, 2025 4 min read

ZDNET's exhaustive 12-hour evaluation of OpenAI's ChatGPT Agent reveals a tool that struggles with reliability, with hallucinations and execution flaws across most tasks. It stumbled on shopping comparisons, data scraping, and presentation design, but a lone success in municipal code analysis hints at its transformative potential, provided it can overcome fundamental accuracy hurdles.
