#AI benchmarks Articles | LavX News

MaxProof Claims Gold-Medal Math Olympiad Performance Through Population-Level Test-Time Scaling

Machine Learning

New AI Benchmarks Are Testing Consistency Instead of Memorization

Daniel Jalkut’s balanced take on AI: why extremes miss the point

Google's Gemini 1.5 Pro Sets New Benchmarks in AI Performance

Claude Code Opus 4.5 Shows Performance Degradation, Independent Tracker Reveals

AI Labs Turn to Pokémon Blue as Unconventional Reasoning Benchmark
