Performance

Latency

How long you wait between sending a request and getting the response back. Lower latency feels snappier; it depends on the model, the length of the answer, and current load.

In practice

A small model often answers in under a second; a large reasoning model may take many seconds.
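
To see the wait for yourself, you can time a single call from start to finish. The snippet below is a minimal sketch, not a benchmark: `measure_latency` and `fake_model` are hypothetical names made up for this illustration, and `fake_model` only sleeps to stand in for a real model request.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def measure_latency(call: Callable[[], T]) -> Tuple[T, float]:
    """Run call() once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()  # monotonic clock meant for timing intervals
    result = call()
    return result, time.perf_counter() - start


def fake_model(delay_s: float) -> str:
    # Hypothetical stand-in for a real model request: it just waits.
    time.sleep(delay_s)
    return "ok"


reply, seconds = measure_latency(lambda: fake_model(0.05))
print(f"{reply!r} took {seconds:.3f} s")
```

Swap the lambda for your own request function to time a real call. Because latency shifts with load, a single measurement is only a rough guide; repeating the call several times gives a steadier picture.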

Image: Mathias Reding on Pexels. Definition free to reuse under CC BY 4.0.