Performance

Latency

The time you wait between sending a request and getting the response back. Lower latency feels snappier; how long you wait depends on the model, the length of the answer, and current load.

In practice

A small model answers in under a second; a large reasoning model may take many seconds.
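As a rough illustration, latency can be measured by timing a single request, and estimated as time to first token plus the time needed to generate the answer. The sketch below is only a minimal example under stated assumptions: `fake_model` is a hypothetical stand-in for a real model call, and the numbers passed to `estimate_latency` are made up for the arithmetic, not measurements of any real model.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def measure_latency(call: Callable[[], T]) -> Tuple[T, float]:
    """Run call() once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = call()
    return result, time.perf_counter() - start


def estimate_latency(ttft_s: float, output_tokens: int, tokens_per_second: float) -> float:
    """Rough estimate: time to first token plus generation time for the answer."""
    return ttft_s + output_tokens / tokens_per_second


def fake_model() -> str:
    # Hypothetical stand-in for a real model request (assumption, not a real API).
    time.sleep(0.05)
    return "hello"


reply, seconds = measure_latency(fake_model)
print(f"{reply!r} took {seconds:.3f} s")

# Illustrative numbers only: 0.5 s to first token, 100 tokens at 50 tokens/s.
print(estimate_latency(0.5, 100, 50))  # 0.5 + 100/50 = 2.5
```

The estimate shows why longer answers take longer: the generation term grows with the number of output tokens, while load and model size mainly change the time to first token and the tokens-per-second rate.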

Related terms

Time to first token, Tokens per second, Throughput

Image: Mathias Reding on Pexels. Definition free to reuse under CC BY 4.0.