Where does MyTokenTracker pricing data come from?

Model prices come from the LiteLLM open pricing dataset, the canonical community-maintained source of per-million-token input and output prices, synced daily and version-tracked so historical prices stay accurate.

How is the AI Cost Index calculated?

The AI Cost Index is the equal-weighted average blended price per million tokens of a fixed basket of models. Each model contributes a blended price using a fixed 3:1 input-to-output ratio: blended = (3 × input + 1 × output) / 4. The basket is frozen as version v1 so the series stays comparable over time.

Is MyTokenTracker data free to use and cite?

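Yes. The datasets are published under Creative Commons Attribution 4.0 (CC BY 4.0), with one exception: Quality & speed data comes from Artificial Analysis and is subject to that source's terms, with attribution. You can use, share, and build on the CC BY 4.0 data commercially, with attribution and a link back to mytokentracker.io.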

Methodology — How MyTokenTracker Measures AI Cost

The datasets and their sources

MyTokenTracker publishes five datasets. Each has one primary source and a fixed refresh schedule.

Dataset	Primary source	Refresh	License
AI Cost Index	Derived from the price catalog below	Daily	CC BY 4.0
Model price catalog	LiteLLM open pricing dataset	Daily	CC BY 4.0
Community usage	Opt-in events from MyTokenTracker users (aggregated)	Live, rolled up every 5 minutes	CC BY 4.0
Quality & speed	Artificial Analysis	Daily	Per source terms, attributed
Human preference	LMArena overall leaderboard	Weekly	CC BY 4.0

The AI Cost Index

The AI Cost Index is the equal-weighted average blended price, in US dollars per million tokens, of a fixed basket of large language models. It measures how the price of running AI moves over time, holding the model mix constant so you see price movement and not a shifting selection.

The blended price

Every model contributes one blended price using a fixed input-to-output ratio, so the index is comparable across models with very different output prices.

blended = (3 × input + 1 × output) / 4

Prices are per million tokens. The 3:1 ratio is a fixed, stated assumption that approximates a read-heavy workload. The index value is the equal-weighted mean of its constituents’ blended prices.
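As a worked illustration of the formula above, here is a minimal Python sketch. The function names and the sample prices are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the live catalog.

```python
from statistics import mean


def blended_price(input_price: float, output_price: float) -> float:
    """Blend per-million-token prices at the fixed 3:1 input-to-output ratio."""
    return (3 * input_price + 1 * output_price) / 4


def index_value(constituents: list[tuple[float, float]]) -> float:
    """Equal-weighted mean of the constituents' blended prices."""
    return mean(blended_price(inp, out) for inp, out in constituents)


# Hypothetical (input, output) prices in USD per million tokens.
basket = [(3.00, 15.00), (1.00, 5.00)]

print(blended_price(3.00, 15.00))  # (3 * 3 + 15) / 4 = 6.0
print(index_value(basket))         # mean of 6.0 and 2.0 = 4.0
```

Because the mean is equal-weighted, a cheap model moves the index as much as an expensive one with the same absolute price change.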

How the baskets work

Fixed slots. Each provider gets one slot per basket. The basket is frozen as version v1 so the series stays comparable.
Ordered aliases. Each slot lists candidate model keys; the first one present in the catalog on a given date is used.
Honest counts. A slot with no match that day is excluded, and the constituent count reflects it.
Real history. Past points are reconstructed from the catalog’s version history at every date a constituent’s price actually changed.
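The slot rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production code: the basket here is shortened, the catalog snapshot and its prices are hypothetical, and the function names are invented for the example.

```python
from typing import Optional

# Shortened basket for illustration: slot name -> ordered candidate model keys.
BASKET = {
    "openai": ["gpt-4o", "gpt-4.1", "gpt-4-turbo"],
    "google": ["gemini-2.5-pro", "gemini-1.5-pro"],
}


def resolve_slot(aliases: list[str], catalog: dict) -> Optional[str]:
    """Return the first alias present in the catalog, or None if none match."""
    return next((key for key in aliases if key in catalog), None)


def resolve_basket(basket: dict, catalog: dict) -> dict:
    """Map each slot to its chosen model key, dropping slots with no match."""
    resolved = {}
    for slot, aliases in basket.items():
        key = resolve_slot(aliases, catalog)
        if key is not None:
            resolved[slot] = key
    return resolved


# Hypothetical catalog snapshot for one date:
# model key -> (input, output) in USD per million tokens.
catalog = {"gpt-4.1": (2.0, 8.0), "gpt-4-turbo": (10.0, 30.0)}

constituents = resolve_basket(BASKET, catalog)
print(constituents)       # {'openai': 'gpt-4.1'}
print(len(constituents))  # 1, the google slot had no match and is excluded
```

In this snapshot gpt-4o is absent, so the openai slot falls through to gpt-4.1, and the constituent count drops to 1 because no google alias is present.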

Frontier basket

4 slots

One flagship model per major provider

anthropic: claude-opus-4-6 · claude-opus-4-1 · claude-opus-4-0 · claude-3-opus-20240229 · claude-3-opus
openai: gpt-4o · gpt-4.1 · gpt-4-turbo
google: gemini-2.5-pro · gemini-pro-latest · gemini-1.5-pro
mistral: mistral/mistral-large-latest · mistral/mistral-large-2512 · mistral-large-latest

Budget basket

5 slots

One cost-efficient workhorse per major provider

anthropic: claude-haiku-4-5 · claude-3-5-haiku · claude-3-5-haiku-latest · claude-3-haiku-20240307
openai: gpt-4o-mini · gpt-4.1-mini · gpt-3.5-turbo
google: gemini-2.5-flash · gemini-2.0-flash · gemini-1.5-flash
mistral: mistral/mistral-small-latest · mistral/mistral-small · mistral-small-latest
deepseek: deepseek-chat · deepseek/deepseek-chat

See the live index and download the series on the AI Cost Index page.

The model price catalog

Prices come from the LiteLLM open pricing dataset, a community-maintained, widely used reference for per-million-token input and output prices across the major providers. We sync it daily and keep a version history so historical analysis stays accurate.

Normalized units. Everything is converted to US dollars per million tokens so models are directly comparable.
Versioned. Every price change is recorded with the date it changed, which is what makes the index’s history possible.
Chat models only, where it matters. Embedding, image, audio, and moderation models are excluded from value and index calculations.
A daily coverage check flags providers or models that drift out of the catalog so gaps are caught, not hidden.
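A rough sketch of the normalization, filtering, and versioning ideas above. It assumes raw entries carry per-token costs in fields named input_cost_per_token and output_cost_per_token plus a mode field; treat those field names, the sample entries, and the shape of the version history as assumptions made for this example.

```python
from bisect import bisect_right
from datetime import date
from typing import Optional


def normalize(raw: dict) -> dict:
    """Keep chat models only and convert per-token costs to USD per million tokens."""
    out = {}
    for key, entry in raw.items():
        if entry.get("mode") != "chat":
            continue  # embedding, image, audio, moderation models are skipped
        inp = entry.get("input_cost_per_token")
        outp = entry.get("output_cost_per_token")
        if inp is None or outp is None:
            continue
        # Round to absorb floating-point noise from the multiplication.
        out[key] = (round(inp * 1_000_000, 6), round(outp * 1_000_000, 6))
    return out


def price_as_of(history: list, day: date) -> Optional[tuple]:
    """Return the (input, output) price in effect on `day`, or None if not yet listed.

    `history` is a list of (effective_date, input, output) sorted by date.
    """
    dates = [effective for effective, _, _ in history]
    i = bisect_right(dates, day)
    if i == 0:
        return None
    _, inp, outp = history[i - 1]
    return inp, outp


# Hypothetical raw entries and version history.
raw = {
    "chat-model": {"mode": "chat", "input_cost_per_token": 0.000003,
                   "output_cost_per_token": 0.000015},
    "embed-model": {"mode": "embedding", "input_cost_per_token": 0.0000001},
}
history = [(date(2025, 1, 10), 5.0, 15.0), (date(2025, 6, 1), 2.5, 10.0)]

print(normalize(raw))                          # {'chat-model': (3.0, 15.0)}
print(price_as_of(history, date(2025, 3, 1)))  # (5.0, 15.0)
print(price_as_of(history, date(2025, 6, 1)))  # (2.5, 10.0)
```

Recording the date of each change means a historical index point only needs an as-of lookup per constituent, not a full daily snapshot.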

Community usage

Community figures are built only from the usage that MyTokenTracker users choose to share, and they are always published as anonymized aggregates, never as individual activity.

Opt-in. A user’s usage is only included if they have explicitly turned on community sharing in their settings.
Aggregate only. The public dashboards show totals, averages, and distributions across the community, computed on a rolling basis. No row identifies a person, project, or key.
Token-based, not estimated. Spend is computed from real token counts priced against the catalog, so it reflects measured usage rather than guesses.
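To make the token-based spend calculation concrete, here is a small sketch. The event structure, model names, and prices are hypothetical; only the arithmetic (token counts priced against per-million-token catalog prices, then aggregated) reflects the description above.

```python
from dataclasses import dataclass


@dataclass
class UsageEvent:
    """One shared usage record (hypothetical shape, no user identifiers)."""
    model: str
    input_tokens: int
    output_tokens: int


# Hypothetical catalog: model -> (input, output) in USD per million tokens.
PRICES = {"model-a": (3.0, 15.0), "model-b": (0.5, 1.5)}


def event_spend(event: UsageEvent, prices: dict) -> float:
    """Price one event's measured token counts against the catalog."""
    input_price, output_price = prices[event.model]
    return (event.input_tokens * input_price
            + event.output_tokens * output_price) / 1_000_000


def total_spend(events: list, prices: dict) -> float:
    """Aggregate spend across all events; only the total is published."""
    return sum(event_spend(e, prices) for e in events)


events = [
    UsageEvent("model-a", 1_200_000, 300_000),    # 3.6 + 4.5 = 8.1
    UsageEvent("model-b", 2_000_000, 1_000_000),  # 1.0 + 1.5 = 2.5
]
print(round(total_spend(events, PRICES), 2))  # 10.6
```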

Quality, speed, and human preference

Price is only half of value, so we pair it with two independent quality signals from established third parties, always attributed to their source.

Artificial Analysis

Provides a benchmark quality index plus speed measures such as median tokens per second and time to first token. Synced daily and compared against price to rank value for money.

LMArena

The crowd-sourced human-preference leaderboard (Chatbot Arena), pulled from its open dataset weekly to show how models rank on real head-to-head votes.
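The exact value-for-money formula is not spelled out here. One simple possibility, shown purely as an assumption, is quality points per blended dollar. The model names, quality scores, and prices below are hypothetical.

```python
def value_ranking(models: list) -> list:
    """Rank models by quality per blended dollar, best first.

    ASSUMPTION: value = quality index / blended price. This ratio is an
    illustrative choice, not a documented MyTokenTracker formula.
    `models` is a list of (name, quality_index, blended_price_usd_per_million).
    """
    scored = [
        (name, quality / blended)
        for name, quality, blended in models
        if blended > 0  # skip free or unpriced models to avoid division by zero
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical inputs: ratios are 10.0, 30.0, and 15.0 respectively.
models = [
    ("model-a", 60.0, 6.0),
    ("model-b", 45.0, 1.5),
    ("model-c", 30.0, 2.0),
]
print([name for name, _ in value_ranking(models)])
# ['model-b', 'model-c', 'model-a']
```

A plain ratio favors cheap models heavily, so a real ranking might cap or transform it; that choice is left open here.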

Freshness and integrity

A citation source is only as good as its accuracy and its uptime. Three things keep the data honest:

Every sync records a heartbeat. Each scheduled job writes a success or failure record, so a pipeline that has silently stopped or is failing gets detected instead of quietly serving stale data.
A daily health check watches those heartbeats and the freshness of each dataset, and alerts when anything falls behind.
No invented numbers. Published figures are computed from the sources above. Where something is an estimate or an assumption, such as the index’s blend ratio, it is stated plainly.
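The heartbeat and health-check idea can be sketched as below. This is a minimal illustration under assumptions: the dataset names, the grace period, and the storage shape (a mapping from dataset to its last successful sync time) are all hypothetical. The maximum ages mirror the daily and weekly refresh schedules listed earlier.

```python
from datetime import datetime, timedelta, timezone

# Maximum expected gap between successful syncs, per the refresh schedule.
MAX_AGE = {
    "price_catalog": timedelta(days=1),  # daily
    "lmarena": timedelta(days=7),        # weekly
}


def stale_datasets(last_success: dict, now: datetime,
                   grace: timedelta = timedelta(hours=6)) -> list:
    """Return datasets whose last successful heartbeat is missing or too old.

    ASSUMPTION: `grace` is a hypothetical allowance for late-running jobs.
    """
    stale = []
    for name, max_age in MAX_AGE.items():
        beat = last_success.get(name)
        if beat is None or now - beat > max_age + grace:
            stale.append(name)
    return stale


now = datetime(2026, 8, 1, 12, 0, tzinfo=timezone.utc)
last_success = {
    "price_catalog": datetime(2026, 7, 31, 23, 0, tzinfo=timezone.utc),  # 13 hours ago
    "lmarena": datetime(2026, 7, 20, 0, 0, tzinfo=timezone.utc),         # over 12 days ago
}
print(stale_datasets(last_success, now))  # ['lmarena']
```

Keying the check on the last success, not the last run, means a job that keeps running but keeps failing is still flagged.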

License and reuse

MyTokenTracker datasets are published under Creative Commons Attribution 4.0, except the Quality & speed data, which is subject to Artificial Analysis's terms and attributed to that source. Use the CC BY 4.0 data, share it, and build on it, including commercially. All we ask is attribution and a link back to mytokentracker.io. Read the data live through the open data API (JSON and CSV), or pull daily snapshots from the open-data repository.

How to cite this

Free to use and cite under CC BY 4.0. See how this is measured.

APA

Champlin Enterprises. (2026). MyTokenTracker AI Cost Index [Data set]. MyTokenTracker. Retrieved August 1, 2026, from https://mytokentracker.io/cost-index

BibTeX

@misc{mytokentracker-cost-index,
  title        = {MyTokenTracker AI Cost Index},
  author       = {{Champlin Enterprises}},
  year         = {2026},
  howpublished = {MyTokenTracker, \url{https://mytokentracker.io/cost-index}},
  note         = {Accessed August 1, 2026. Licensed CC BY 4.0.},
  url          = {https://mytokentracker.io/cost-index}
}

Need a fixed point in time? Every day’s data is permanently archived in the open-data repository, so you can cite a specific date by linking that day’s committed file.
