Methodology
How we measure the cost of AI
Every number on MyTokenTracker is built from a named source on a stated cadence, with the formulas written down. This page is the reference a researcher, journalist, or AI engine can read to understand and cite exactly what each dataset means.
Last reviewed June 17, 2026. Basket version v1.
The datasets and their sources
MyTokenTracker publishes five datasets. Each has one primary source and a fixed refresh schedule.
| Dataset | Primary source | Refresh | License |
|---|---|---|---|
| AI Cost Index | Derived from the price catalog below | Daily | CC BY 4.0 |
| Model price catalog | LiteLLM open pricing dataset | Daily | CC BY 4.0 |
| Community usage | Opt-in events from MyTokenTracker users (aggregated) | Live, rolled up every 5 minutes | CC BY 4.0 |
| Quality & speed | Artificial Analysis | Daily | Per source terms, attributed |
| Human preference | LMArena overall leaderboard | Weekly | CC BY 4.0 |
The AI Cost Index
The AI Cost Index is the equal-weighted average blended price, in US dollars per million tokens, of a fixed basket of large language models. It measures how the price of running AI moves over time, holding the model mix constant so you see price movement and not a shifting selection.
The blended price
Every model contributes one blended price using a fixed input-to-output ratio, so the index is comparable across models with very different output prices.
Prices are per million tokens. The 3:1 ratio is a fixed, stated assumption that approximates a read-heavy workload. The index value is the equal-weighted mean of its constituents’ blended prices.
How the baskets work
- Fixed slots. Each provider gets one slot per basket. The basket is frozen as version v1 so the series stays comparable.
- Ordered aliases. Each slot lists candidate model keys; the first one present in the catalog on a given date is used.
- Honest counts. A slot with no match that day is excluded, and the constituent count reflects it.
- Real history. Past points are reconstructed from the catalog’s version history at every date a constituent’s price actually changed.
Frontier basket
4 slotsOne flagship model per major provider
- anthropic
- claude-opus-4-6 · claude-opus-4-1 · claude-opus-4-0 · claude-3-opus-20240229 · claude-3-opus
- openai
- gpt-4o · gpt-4.1 · gpt-4-turbo
- gemini-2.5-pro · gemini-pro-latest · gemini-1.5-pro
- mistral
- mistral/mistral-large-latest · mistral/mistral-large-2512 · mistral-large-latest
Budget basket
5 slotsOne cost-efficient workhorse per major provider
- anthropic
- claude-haiku-4-5 · claude-3-5-haiku · claude-3-5-haiku-latest · claude-3-haiku-20240307
- openai
- gpt-4o-mini · gpt-4.1-mini · gpt-3.5-turbo
- gemini-2.5-flash · gemini-2.0-flash · gemini-1.5-flash
- mistral
- mistral/mistral-small-latest · mistral/mistral-small · mistral-small-latest
- deepseek
- deepseek-chat · deepseek/deepseek-chat
See the live index and download the series on the AI Cost Index page.
The model price catalog
Prices come from the LiteLLM open pricing dataset, the community-maintained, widely-used reference for per-million-token input and output prices across every major provider. We sync it daily and keep a version history so historical analysis stays accurate.
- Normalized units. Everything is converted to US dollars per million tokens so models are directly comparable.
- Versioned. Every price change is recorded with the date it changed, which is what makes the index’s history possible.
- Chat models only, where it matters. Embedding, image, audio, and moderation models are excluded from value and index calculations.
- A daily coverage check flags providers or models that drift out of the catalog so gaps are caught, not hidden.
Community usage
Community figures are built only from the usage that MyTokenTracker users choose to share, and they are always published as anonymized aggregates, never as individual activity.
- Opt-in. A user’s usage is only included if they have explicitly turned on community sharing in their settings.
- Aggregate only. The public dashboards show totals, averages, and distributions across the community, computed on a rolling basis. No row identifies a person, project, or key.
- Token-based, not estimated. Spend is computed from real token counts priced against the catalog, so it reflects measured usage rather than guesses.
Quality, speed, and human preference
Price is only half of value, so we pair it with two independent quality signals from established third parties, always attributed to their source.
Artificial Analysis
Benchmark quality index plus speed measures such as median tokens per second and time to first token. Synced daily and used to rank value for money against price.
LMArena
The crowd-sourced human-preference leaderboard (Chatbot Arena), pulled from its open dataset weekly to show how models rank on real head-to-head votes.
Freshness and integrity
A citation source is only as good as its accuracy and its uptime. Two things keep the data honest:
- Every sync records a heartbeat. Each scheduled job writes a success or failure record, so a silently-stopped or failing pipeline is detected rather than served stale.
- A daily health check watches those heartbeats and the freshness of each dataset, and alerts when anything falls behind.
- No invented numbers. Published figures are computed from the sources above. Where something is an estimate or an assumption, such as the index’s blend ratio, it is stated plainly.
License and reuse
Every MyTokenTracker dataset is published under Creative Commons Attribution 4.0. Use it, share it, and build on it, including commercially. All we ask is attribution and a link back to mytokentracker.io. Read the data live through the open data API (JSON and CSV), or pull daily snapshots from the open-data repository.
How to cite this
Free to use and cite under CC BY 4.0. See how this is measured.
Champlin Enterprises. (2026). MyTokenTracker AI Cost Index [Data set]. MyTokenTracker. Retrieved June 17, 2026, from https://mytokentracker.io/cost-index
@misc{mytokentracker-cost-index,
title = {MyTokenTracker AI Cost Index},
author = {{Champlin Enterprises}},
year = {2026},
howpublished = {MyTokenTracker, \url{https://mytokentracker.io/cost-index}},
note = {Accessed June 17, 2026. Licensed CC BY 4.0.},
url = {https://mytokentracker.io/cost-index}
}
Need a fixed point in time? Every day’s data is permanently archived in the open-data repository, so you can cite a specific date by linking that day’s committed file.