Tools & ecosystem

Benchmark

A standardized test used to measure and compare model quality on tasks like reasoning, coding, or knowledge.

In practice

A model that scores higher on a coding benchmark may also cost more per token, so the score alone does not tell you which model is the better value for a given workload.
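To make the trade-off concrete, here is a minimal Python sketch that ranks two models by raw benchmark score and by score per dollar. Everything in it is an assumption for illustration: the model names, the scores, the prices, and the `score_per_dollar` helper are hypothetical and are not taken from any real benchmark or price list.

```python
from dataclasses import dataclass


@dataclass
class ModelResult:
    name: str
    benchmark_score: float         # fraction of benchmark tasks solved, 0.0 to 1.0
    usd_per_million_tokens: float  # hypothetical price, not a real quote


def score_per_dollar(m: ModelResult) -> float:
    """Benchmark score divided by price: a crude value-for-money ratio."""
    return m.benchmark_score / m.usd_per_million_tokens


# Made-up numbers chosen only to show the two rankings can disagree.
results = [
    ModelResult("model-a", benchmark_score=0.80, usd_per_million_tokens=10.0),
    ModelResult("model-b", benchmark_score=0.70, usd_per_million_tokens=2.0),
]

best_score = max(results, key=lambda m: m.benchmark_score)
best_value = max(results, key=score_per_dollar)

print(f"highest score: {best_score.name}")      # model-a
print(f"best score per dollar: {best_value.name}")  # model-b
```

With these invented figures the higher-scoring model is not the better value. A simple ratio like this is only one possible way to weigh quality against cost; a real comparison would also depend on how many tokens each model needs for the task.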

Image: Godfrey Atima on Pexels. Definition free to reuse under CC BY 4.0.