Performance

Tokens per second

TPS

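How fast a model streams out its answer once it starts, measured in output tokens generated per second. The wait before the first token appears is not part of this figure. A higher value means the text appears more quickly.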
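As a rough illustration of the definition, the sketch below computes a streaming rate from two timestamps. The function name and parameters are hypothetical, and it assumes one common convention: count the tokens that arrive after the first one and divide by the time between the first and last token. Tools may measure this differently.

```python
def tokens_per_second(output_tokens: int, first_token_at: float, last_token_at: float) -> float:
    """Estimate streaming speed, ignoring the wait before the first token.

    Assumption: timestamps are in seconds, and the rate is taken over the
    interval between the first and last token, so only the tokens after
    the first one are counted.
    """
    elapsed = last_token_at - first_token_at
    if output_tokens < 2 or elapsed <= 0:
        raise ValueError("need at least two tokens and a positive elapsed time")
    return (output_tokens - 1) / elapsed


# Example: 481 tokens, first token at 0.5 s, last token at 6.5 s.
rate = tokens_per_second(481, 0.5, 6.5)
print(rate)  # 80.0
```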

In practice

At 80 tokens per second, a 400-word answer (roughly 530 tokens, assuming about 0.75 English words per token) finishes streaming in just under seven seconds.
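The arithmetic behind an estimate like this can be sketched in a few lines. The helper name is hypothetical, and the default of 0.75 words per token is only a rule of thumb for English text; the real ratio depends on the tokenizer and the content.

```python
def streaming_seconds(words: int, tokens_per_second: float, words_per_token: float = 0.75) -> float:
    """Estimate how long an answer takes to stream once output has started.

    Assumption: words_per_token=0.75 is a rough average for English text,
    not a property of any particular model.
    """
    if tokens_per_second <= 0 or words_per_token <= 0:
        raise ValueError("rates must be positive")
    tokens = words / words_per_token      # convert words to an approximate token count
    return tokens / tokens_per_second     # time = tokens / rate


# A 400-word answer at 80 tokens per second.
print(round(streaming_seconds(400, 80), 1))  # 6.7
```

The estimate covers streaming time only; any delay before the first token arrives comes on top of it.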

Image: Mathias Reding on Pexels. Definition free to reuse under CC BY 4.0.