Fundamentals
Inference
Running a trained model to get an answer, as opposed to training it. Every API call you make to a model runs inference, and this is the work providers bill you for, typically per token.
In practice
Asking a model to draft a reply is inference. Building the model in the first place was training, a cost paid up front rather than on every request.
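To make per-token billing concrete, here is a minimal sketch of the arithmetic behind the cost of one inference call. It assumes input and output tokens are priced separately per million tokens, which is a common scheme but not a universal one. The function name and the prices are hypothetical and chosen only for illustration.

```python
def inference_cost(input_tokens: int, output_tokens: int,
                   input_price_per_million: float,
                   output_price_per_million: float) -> float:
    """Return the dollar cost of a single inference call.

    Prices are expressed in dollars per one million tokens
    (an assumed pricing scheme, not any specific provider's).
    """
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000


# Hypothetical prices: $3 per million input tokens, $15 per million output tokens.
cost = inference_cost(input_tokens=1200, output_tokens=300,
                      input_price_per_million=3.00,
                      output_price_per_million=15.00)
print(f"${cost:.4f}")  # prints "$0.0081"
```

A single call costs a fraction of a cent here, so inference spend is driven mainly by how many calls you make and how long each prompt and reply is.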
Image: Google DeepMind on Pexels. Definition free to reuse under CC BY 4.0.