Benchmark
A standardized test used to measure and compare AI models on tasks like coding, math, or reasoning.
Why it matters
Benchmark scores drive the headlines, but they can be gamed, for instance when test questions leak into a model's training data, so performance in real-world use matters more than a leaderboard score.