Training & tuning

Reinforcement learning from human feedback

RLHF

A tuning method in which humans rank model outputs and the model is trained, typically through a reward model fitted to those rankings, to prefer the higher-ranked ones. It is a major reason chat models feel helpful and polite.
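One way to picture the "learns to prefer" step is the pairwise preference loss often used when fitting a reward model. The sketch below is illustrative only and assumes a Bradley-Terry style objective, which this definition does not itself specify. The helper names `ranking_to_pairs` and `pairwise_preference_loss` are hypothetical, and the reward scores are made-up numbers standing in for a real model's outputs.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_outputs):
    """Turn a best-first human ranking into (preferred, less_preferred) pairs.

    Hypothetical helper: combinations() keeps input order, so the first
    element of each pair is always ranked above the second.
    """
    return list(combinations(ranked_outputs, 2))


def pairwise_preference_loss(reward_preferred, reward_other):
    """Assumed Bradley-Terry style loss: -log(sigmoid(r_preferred - r_other)).

    The loss is small when the preferred output already scores higher,
    and large when the reward model has the order backwards.
    """
    margin = reward_preferred - reward_other
    # Numerically stable form of log(1 + exp(-margin)).
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# A labeler ranked three candidate answers, best first.
pairs = ranking_to_pairs(["answer_a", "answer_b", "answer_c"])

# Made-up reward scores for illustration only.
rewards = {"answer_a": 1.5, "answer_b": 0.2, "answer_c": -0.7}

total_loss = sum(
    pairwise_preference_loss(rewards[better], rewards[worse])
    for better, worse in pairs
)
print(pairs)
print(round(total_loss, 4))
```

In a full pipeline, a loss of this kind would be minimized to train the reward model, and a separate reinforcement learning step would then tune the chat model to produce outputs that the reward model scores highly.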

In practice

RLHF is used to teach a model to refuse harmful requests and to answer in the way people prefer.

Image: panumas nikhomkhai on Pexels. Definition free to reuse under CC BY 4.0.