More

Vibe Coders Showcase Models Community Data Value Compare Calculator Glossary Methodology Install Docs How It Works Help Blog

Sign in Start free

Performance concept illustration

Performance

Throughput

How much work a model can do per unit of time, such as total tokens served per second across all requests. Matters most at scale.

In practice

High throughput lets you serve thousands of users on the same model at once.

Related terms

Tokens per second Latency

See what your tokens really cost

Track usage and spend across every model and platform, free.

Start tracking free See the AI Cost Index

Image: Mathias Reding on Pexels. Definition free to reuse under CC BY 4.0.