Decoding & sampling

Max tokens

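A cap on how many tokens the model is allowed to generate in one response. It bounds the length of an answer and the output-token portion of its cost. If the model reaches the cap before it finishes, the response is cut off at that point.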
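To make the idea of a cap concrete, here is a minimal sketch of a toy decoding loop. It is an illustration, not any provider's implementation: `generate`, `chatty_model`, and the `<eos>` stop token are hypothetical names chosen for this example.

```python
from typing import Callable, List


def generate(
    next_token: Callable[[List[str]], str],
    max_tokens: int,
    stop_token: str = "<eos>",
) -> List[str]:
    """Toy decoding loop: emit tokens until the stop token or the cap is hit."""
    output: List[str] = []
    while len(output) < max_tokens:
        token = next_token(output)
        if token == stop_token:
            break  # the model finished on its own, under the cap
        output.append(token)
    return output


# Hypothetical stand-in for a model that never stops on its own.
def chatty_model(tokens_so_far: List[str]) -> str:
    return "word"


print(len(generate(chatty_model, max_tokens=300)))  # 300
```

The loop ends either when the model emits its stop token or when the cap is reached, whichever comes first, so the cap only matters for responses that would otherwise run longer.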

In practice

Set max tokens to 300 so a summary cannot run long and run up the bill.
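The cost side of that cap is simple arithmetic. The sketch below computes the worst-case output cost for a given cap; the function name `max_output_cost` and the price of 10.00 per million output tokens are assumptions for illustration, not real pricing for any model.

```python
def max_output_cost(max_tokens: int, price_per_million_output_tokens: float) -> float:
    """Upper bound on the output-token cost of one response."""
    return max_tokens * price_per_million_output_tokens / 1_000_000


# Assumed, illustrative price: 10.00 per million output tokens.
worst_case = max_output_cost(300, 10.00)
print(f"{worst_case:.4f}")  # 0.0030
```

This bounds only the generated tokens. Input (prompt) tokens are typically billed separately and are not limited by this setting.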

Related terms

Output tokens, Completion

Image: Codioful on Pexels. Definition free to reuse under CC BY 4.0.