The cheapest LLM APIs, right now.
Verified token pricing from every major provider — sorted, filtered, and visualized. No fabricated numbers: every price links back to its official source.
| Model | Provider | Input $/Mtok | Output $/Mtok | Blended $/Mtok | Context | Notes |
|---|---|---|---|---|---|---|
| GLM-4.7-Flash | Zhipu | $0.000 | $0.000 | $0.000 | 128K | Open |
| Rerank 3.5 | Cohere | $0.020 | $0.020 | $0.020 | n/a | |
| text-embedding-3-small | OpenAI | $0.020 | $0.020 | $0.020 | 8K | |
| rerank-2.5-lite | Voyage AI | $0.020 | $0.020 | $0.020 | n/a | |
| voyage-4-lite | Voyage AI | $0.020 | $0.020 | $0.020 | n/a | |
| text-embedding-004 | Google | $0.025 | $0.025 | $0.025 | 2K | |
| GLM-OCR | Zhipu | $0.030 | $0.030 | $0.030 | 128K | Open |
| rerank-2.5 | Voyage AI | $0.050 | $0.050 | $0.050 | n/a | |
| voyage-4 | Voyage AI | $0.060 | $0.060 | $0.060 | n/a | |
| Granite 4.0 Micro | IBM | $0.017 | $0.112 | $0.065 | 128K | Open |
| Llama 3.1 8B | Meta | $0.050 | $0.080 | $0.065 | 128K | Open |
| Baichuan M2-32B | Baichuan | $0.070 | $0.070 | $0.070 | 33K | Open |
| LFM2 24B A2B | Together | $0.030 | $0.120 | $0.075 | 128K | Open |
| Nova Micro | Amazon | $0.035 | $0.140 | $0.088 | 128K | |
| Ministral 3 3B | Mistral | $0.100 | $0.100 | $0.100 | 128K | Open |
| Pixtral 12B | Mistral | $0.100 | $0.100 | $0.100 | 128K | Open |
| Nemotron 70B Instruct | NVIDIA | $0.100 | $0.100 | $0.100 | 128K | Open |
| Reka Edge | Reka | $0.100 | $0.100 | $0.100 | 66K | |
| GLM-4-32B-0414 | Zhipu | $0.100 | $0.100 | $0.100 | 128K | Open |
| Granite Embedding 278M Multilingual | IBM | $0.106 | $0.106 | $0.106 | n/a | Open |
| Embed 4 | Cohere | $0.120 | $0.120 | $0.120 | n/a | |
| voyage-4-large | Voyage AI | $0.120 | $0.120 | $0.120 | n/a | |
| voyage-multimodal-3.5 | Voyage AI | $0.120 | $0.120 | $0.120 | n/a | |
| Qwen-Turbo | Alibaba | $0.050 | $0.200 | $0.125 | 1M | |
| text-embedding-3-large | OpenAI | $0.130 | $0.130 | $0.130 | 8K | |
| Mistral Small 3.2 24B | Mistral | $0.080 | $0.200 | $0.140 | 128K | Open |
| Nova Lite | Amazon | $0.060 | $0.240 | $0.150 | 300K | |
| Granite 4 H Small | IBM | $0.060 | $0.250 | $0.155 | 128K | Open |
| voyage-code-3 | Voyage AI | $0.180 | $0.180 | $0.180 | 32K | |
| voyage-context-3 | Voyage AI | $0.180 | $0.180 | $0.180 | 32K | |
| GPT OSS 20B | Fireworks | $0.070 (cached $0.035) | $0.300 | $0.185 | 128K | Open |
| Gemini 2.5 Flash | Google | $0.075 | $0.300 | $0.188 | 1M | |
| Voxtral Small 24B | Mistral | $0.100 | $0.300 | $0.200 | 128K | Open |
| DeepSeek V4 Flash | DeepSeek | $0.140 (cached $0.003) | $0.280 | $0.210 | 1M | |
| DeepSeek V4 Flash | Fireworks | $0.140 (cached $0.028) | $0.280 | $0.210 | 1M | |
| Llama 4 Scout | Meta | $0.110 | $0.340 | $0.225 | 10M | Open |
| GLM-4.7-FlashX | Zhipu | $0.070 (cached $0.010) | $0.400 | $0.235 | 128K | Open |
| Gemini 2.5 Flash-Lite | Google | $0.100 | $0.400 | $0.250 | 1M | |
| GPT-4.1 nano | OpenAI | $0.100 (cached $0.050) | $0.400 | $0.250 | 1M | |
| Qwen-Flash | Alibaba | $0.115 | $0.460 | $0.288 | 1M | |
| Jamba Mini | AI21 Labs | $0.200 | $0.400 | $0.300 | 256K | Open |
| Grok 4.1 Fast | xAI | $0.200 | $0.500 | $0.350 | 2M | |
| Command R 08-2024 | Cohere | $0.150 | $0.600 | $0.375 | 128K | |
| GPT OSS 120B | Fireworks | $0.150 (cached $0.015) | $0.600 | $0.375 | 128K | Open |
| Granite 4 H Medium | IBM | $0.150 | $0.600 | $0.375 | 128K | Open |
| Mistral Small 4 | Mistral | $0.150 | $0.600 | $0.375 | 128K | |
| gpt-oss-120B | Together | $0.150 | $0.600 | $0.375 | 128K | Open |
| Llama 3.1 8B Instant | Groq | $0.050 | $1.00 | $0.525 | 128K | 840 TPS, Open |
| GPT-OSS 20B | Groq | $0.075 | $1.00 | $0.537 | 128K | 1000 TPS, Open |
| Mixtral 8x7B Instruct | Mistral | $0.540 | $0.540 | $0.540 | 32K | Open |
| Llama 4 Scout | Groq | $0.110 | $1.00 | $0.555 | 10M | 594 TPS, Open |
| Gemma-4-31B-it-Pearl | Together | $0.280 | $0.860 | $0.570 | 128K | Open |
| GPT-OSS 120B | Groq | $0.150 | $1.00 | $0.575 | 128K | 500 TPS, Open |
| Codestral | Mistral | $0.300 | $0.900 | $0.600 | 256K | |
| Codestral 2508 | Mistral | $0.300 | $0.900 | $0.600 | 256K | |
| GLM-4.6V | Zhipu | $0.300 (cached $0.050) | $0.900 | $0.600 | 128K | Open |
| Qwen3 32B | Groq | $0.290 | $1.00 | $0.645 | 128K | 662 TPS, Open |
| GLM-4.5-Air | Zhipu | $0.200 (cached $0.030) | $1.10 | $0.650 | 128K | Open |
| DeepSeek V4 Pro | DeepSeek | $0.435 (cached $0.004) | $0.870 | $0.652 | 1M | |
| Llama 3.3 70B | Meta | $0.590 | $0.790 | $0.690 | 128K | Open |
| MiniMax 2.5 | Fireworks | $0.300 (cached $0.030) | $1.20 | $0.750 | 128K | Open |
| MiniMax 2.7 | Fireworks | $0.300 (cached $0.060) | $1.20 | $0.750 | 128K | Open |
| MiniMax M3 | Fireworks | $0.300 (cached $0.060) | $1.20 | $0.750 | 1M | Open |
| Granite 4 H Large | IBM | $0.300 | $1.20 | $0.750 | 128K | Open |
| MiniMax-M2 | MiniMax | $0.300 (cached $0.030) | $1.20 | $0.750 | 205K | Open |
| MiniMax-M2.1 | MiniMax | $0.300 (cached $0.030) | $1.20 | $0.750 | 205K | Open |
| MiniMax-M2.5 | MiniMax | $0.300 (cached $0.030) | $1.20 | $0.750 | 205K | Open |
| MiniMax-M2.7 | MiniMax | $0.300 (cached $0.060) | $1.20 | $0.750 | 205K | Open |
| MiniMax-M3 | MiniMax | $0.300 (cached $0.060) | $1.20 | $0.750 | 1M | Open |
| MiniMax M2.5 | Together | $0.300 (cached $0.060) | $1.20 | $0.750 | 128K | Open |
| MiniMax M3 | Together | $0.300 (cached $0.060) | $1.20 | $0.750 | 1M | Open |
| Llama 3.3 70B Versatile | Groq | $0.590 | $1.00 | $0.795 | 128K | 394 TPS, Open |
| Qwen-Plus | Alibaba | $0.400 | $1.20 | $0.800 | 131K | |
| Qwen 3.6 27B | Groq | $0.600 | $1.00 | $0.800 | 128K | 500 TPS, Open |
| Qwen3.7-Plus | Together | $0.320 | $1.28 | $0.800 | 128K | Open |
| Gemini 3.1 Flash-Lite | Google | $0.250 | $1.50 | $0.875 | 1M | |
| Command R 03-2024 | Cohere | $0.500 | $1.50 | $1.00 | 128K | |
| Qwen 3.7 Plus | Fireworks | $0.400 (cached $0.080) | $1.60 | $1.00 | 128K | Open |
| Mistral Large 3 | Mistral | $0.500 | $1.50 | $1.00 | 128K | |
| GPT-4.1 mini | OpenAI | $0.400 (cached $0.200) | $1.60 | $1.00 | 1M | |
| Sonar | Perplexity | $1.00 | $1.00 | $1.00 | 200K | |
| Llama 3.3 70B | Together | $1.04 | $1.04 | $1.04 | 128K | Open |
| Devstral 2 2512 | Mistral | $0.400 | $2.00 | $1.20 | 256K | Open |
| Mistral Medium 3 | Mistral | $0.400 | $2.00 | $1.20 | 128K | |
| Gemini 3.1 Flash | Google | $0.300 | $2.50 | $1.40 | 1M | |
| Reka Flash | Reka | $0.800 | $2.00 | $1.40 | 128K | |
| GLM-4.5 | Zhipu | $0.600 (cached $0.110) | $2.20 | $1.40 | 128K | Open |
| GLM-4.6 | Zhipu | $0.600 (cached $0.110) | $2.20 | $1.40 | 128K | Open |
| GLM-4.7 | Zhipu | $0.600 (cached $0.110) | $2.20 | $1.40 | 128K | Open |
| Command A | Cohere | $1.00 | $2.00 | $1.50 | 256K | |
| NVIDIA Nemotron 3 Ultra | Fireworks | $0.600 (cached $0.120) | $2.40 | $1.50 | 128K | Open |
| Grok Build 0.1 | xAI | $1.00 | $2.00 | $1.50 | 256K | |
| QwQ-Plus | Alibaba | $0.800 | $2.40 | $1.60 | 131K | |
| Qwen 3.6 Plus | Fireworks | $0.500 (cached $0.100) | $3.00 | $1.75 | 128K | Open |
| Kimi K2.5 | Fireworks | $0.600 (cached $0.100) | $3.00 | $1.80 | 256K | Open |
| Kimi K2.5 | Moonshot | $0.600 (cached $0.100) | $3.00 | $1.80 | 262K | Open |
| Grok 4.3 | xAI | $1.25 | $2.50 | $1.88 | 1M | |
| Nova Pro | Amazon | $0.800 | $3.20 | $2.00 | 300K | |
| Llama Nemotron Ultra 253B | NVIDIA | $0.600 | $3.60 | $2.10 | 128K | Open |
| Nemotron 3 Ultra | NVIDIA | $0.600 (cached $0.120) | $3.60 | $2.10 | 128K | Open |
| NVIDIA Nemotron 3 Ultra | Together | $0.600 (cached $0.200) | $3.60 | $2.10 | 128K | Open |
| Qwen3.5-397B-A17B | Together | $0.600 (cached $0.350) | $3.60 | $2.10 | 128K | Open |
| GLM-5 | Zhipu | $1.00 (cached $0.200) | $3.20 | $2.10 | 128K | Open |
| Kimi K2.6 | Fireworks | $0.950 (cached $0.160) | $4.00 | $2.48 | 256K | Open |
| Kimi K2.7 Code | Fireworks | $0.950 (cached $0.190) | $4.00 | $2.48 | 256K | Open |
| Kimi K2.6 | Moonshot | $0.950 (cached $0.160) | $4.00 | $2.48 | 262K | Open |
| Kimi K2.7 Code | Moonshot | $0.950 (cached $0.190) | $4.00 | $2.48 | 262K | Open |
| Kimi K2.7 Code | Together | $0.950 (cached $0.190) | $4.00 | $2.48 | 256K | Open |
| Qwen3.7-Max | Together | $1.25 (cached $0.130) | $3.75 | $2.50 | 128K | Open |
| GLM-5-Turbo | Zhipu | $1.20 (cached $0.240) | $4.00 | $2.60 | 128K | Open |
| GLM-5V-Turbo | Zhipu | $1.20 (cached $0.240) | $4.00 | $2.60 | 128K | Open |
| DeepSeek V4 Pro | Fireworks | $1.74 (cached $0.145) | $3.48 | $2.61 | 1M | |
| DeepSeek V4 Pro | Together | $1.74 (cached $0.200) | $3.48 | $2.61 | 1M | |
| GPT-5.4 mini | OpenAI | $0.750 (cached $0.075) | $4.50 | $2.63 | 1M | |
| o3-mini | OpenAI | $1.10 (cached $0.550) | $4.40 | $2.75 | 200K | |
| o4-mini | OpenAI | $1.10 (cached $0.550) | $4.40 | $2.75 | 200K | |
| GLM 5.1 | Fireworks | $1.40 (cached $0.260) | $4.40 | $2.90 | 128K | Open |
| GLM 5.2 | Fireworks | $1.40 (cached $0.260) | $4.40 | $2.90 | 128K | Open |
| GLM-5.2 | Together | $1.40 (cached $0.260) | $4.40 | $2.90 | 128K | Open |
| GLM-5.1 | Zhipu | $1.40 (cached $0.260) | $4.40 | $2.90 | 128K | Open |
| GLM-5.2 | Zhipu | $1.40 (cached $0.260) | $4.40 | $2.90 | 128K | Open |
| Qwen3-Max | Alibaba | $1.20 | $4.80 | $3.00 | 262K | |
| Claude Haiku 4.5 | Anthropic | $1.00 (cached $0.100) | $5.00 | $3.00 | 200K | |
| Magistral Medium | Mistral | $2.00 | $5.00 | $3.50 | 128K | |
| Mixtral 8x22B Instruct | Mistral | $2.00 | $6.00 | $4.00 | 64K | Open |
| Pixtral Large 2411 | Mistral | $2.00 | $6.00 | $4.00 | 128K | Open |
| Reka Core | Reka | $2.00 | $6.00 | $4.00 | 128K | |
| Grok 4.20 | xAI | $2.00 | $6.00 | $4.00 | 256K | |
| Mistral Medium 3.5 | Mistral | $1.50 | $7.50 | $4.50 | 128K | |
| Kimi K2.7 Code HighSpeed | Moonshot | $1.90 (cached $0.380) | $8.00 | $4.95 | 262K | Open |
| Jamba Large | AI21 Labs | $2.00 | $8.00 | $5.00 | 256K | Open |
| GPT-4.1 | OpenAI | $2.00 (cached $0.500) | $8.00 | $5.00 | 1M | |
| Sonar Deep Research | Perplexity | $2.00 | $8.00 | $5.00 | 200K | |
| Sonar Reasoning Pro | Perplexity | $2.00 | $8.00 | $5.00 | 200K | |
| Gemini 3.5 Flash | Google | $1.50 | $9.00 | $5.25 | 1M | |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | $5.63 | 2M | |
| Yi Large | 01.AI | $3.00 | $9.00 | $6.00 | 32K | |
| Command R+ 08-2024 | Cohere | $2.50 | $10.00 | $6.25 | 128K | |
| Gemini 3.1 Pro | Google | $2.00 | $12.00 | $7.00 | 2M | |
| Nova Premier | Amazon | $2.50 | $12.50 | $7.50 | 1M | |
| GPT-5.4 | OpenAI | $2.50 (cached $0.250) | $15.00 | $8.75 | 1M | |
| Claude Sonnet 4.6 | Anthropic | $3.00 (cached $0.300) | $15.00 | $9.00 | 200K | |
| Sonar Pro | Perplexity | $3.00 | $15.00 | $9.00 | 200K | |
| Grok 4 | xAI | $3.00 | $15.00 | $9.00 | 256K | |
| GPT-Realtime-2 | OpenAI | $4.00 (cached $0.400) | $24.00 | $14.00 | 128K | |
| Claude Opus 4.5 | Anthropic | $5.00 (cached $0.500) | $25.00 | $15.00 | 200K | |
| Claude Opus 4.6 | Anthropic | $5.00 (cached $0.500) | $25.00 | $15.00 | 200K | |
| Claude Opus 4.7 | Anthropic | $5.00 (cached $0.500) | $25.00 | $15.00 | 200K | |
| Claude Opus 4.8 | Anthropic | $5.00 (cached $0.500) | $25.00 | $15.00 | 200K | |
| Claude Mythos 5 | Anthropic | $10.00 | $20.00 | $15.00 | 200K | |
| GPT-5.5 | OpenAI | $5.00 (cached $0.500) | $30.00 | $17.50 | 270K | |
| GPT-Image-2 | OpenAI | $8.00 (cached $2.00) | $30.00 | $19.00 | 128K | |
| Claude Fable 5 | Anthropic | $10.00 (cached $1.00) | $50.00 | $30.00 | 200K | |
Understanding the table
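Blended cost is the simple average of the input and output prices per 1M tokens, i.e. (input + output) / 2. It is a quick way to compare models when your usage mixes both, and it implicitly assumes you send and receive about the same number of tokens. The colored bar shows each model's blended cost relative to the most expensive model in the table.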
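The two quantities described above can be sketched in a few lines of Python. This is an illustration only, not the site's own code: the function names `blended_cost` and `relative_cost` are hypothetical, and the three sample rows are copied from the table.

```python
def blended_cost(input_price: float, output_price: float) -> float:
    """Simple average of input and output price, both in $ per 1M tokens."""
    return (input_price + output_price) / 2


def relative_cost(blended: float, max_blended: float) -> float:
    """Blended cost as a fraction of the most expensive blended cost."""
    return blended / max_blended if max_blended else 0.0


# (input $/Mtok, output $/Mtok) for three rows taken from the table.
rows = {
    "Nova Micro": (0.035, 0.140),
    "Claude Haiku 4.5": (1.00, 5.00),
    "Claude Fable 5": (10.00, 50.00),
}

blended = {name: blended_cost(i, o) for name, (i, o) in rows.items()}
most_expensive = max(blended.values())

for name, cost in blended.items():
    share = relative_cost(cost, most_expensive)
    print(f"{name}: ${cost:.3f} blended, {share:.1%} of the most expensive")
```

If your workload is heavily skewed toward input or output tokens, a weighted average would be a better fit than the equal-weight blend used here.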
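Green = cheap (<$1/Mtok blended) · Gold = mid ($1–$15) · Red = expensive (>$15, typically frontier models). Cached input prices, where available, are shown alongside the input price. For the models listed here, the cached rate is roughly 40–99% below the regular input rate, and it applies only to input tokens served from cache.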
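As a rough sketch, the color tiers and the caching discount can be expressed as two small Python helpers. The names `cost_tier` and `cache_savings` are hypothetical, and the handling of values that fall exactly on $1 or $15 (both treated as gold) is an assumption read off the legend, not something the page states explicitly.

```python
def cost_tier(blended: float) -> str:
    """Map a blended $/Mtok price to the legend's color tier."""
    if blended < 1.0:
        return "green"   # cheap
    if blended <= 15.0:
        return "gold"    # mid; boundary handling is an assumption
    return "red"         # expensive


def cache_savings(input_price: float, cached_price: float) -> float:
    """Fractional discount of the cached input rate versus the regular input rate."""
    return 1.0 - cached_price / input_price


# Blended prices for Nova Micro, Claude Opus 4.5 and GPT-5.5 from the table.
for blended in (0.088, 15.00, 17.50):
    print(blended, cost_tier(blended))

# Input and cached input prices for GPT-4.1 nano and DeepSeek V4 Flash.
for name, regular, cached in [
    ("GPT-4.1 nano", 0.100, 0.050),
    ("DeepSeek V4 Flash (DeepSeek)", 0.140, 0.003),
]:
    print(f"{name}: cached input is {cache_savings(regular, cached):.1%} cheaper")
```

The discount only affects the share of input tokens that actually hits the cache, so the saving on a whole bill is usually smaller than these per-token figures.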
Hosting providers like Groq and Together list the same open-weight models at their own prices, so you can compare e.g. Llama 3.3 70B on Groq ($0.59/$0.79) vs. Together ($1.04/$1.04). The TPS badge shows Groq's inference throughput in tokens per second.
Use the Cost Calculator to estimate your monthly spending, or the Compare page to view any models side by side.
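For a back-of-the-envelope estimate without the calculator, a monthly bill can be approximated from token volumes and the per-Mtok prices in the table. The sketch below is not the site's Cost Calculator: the function `monthly_cost`, its parameters, and the example traffic volumes are all hypothetical, and it ignores anything beyond per-token charges.

```python
from typing import Optional


def monthly_cost(
    input_tokens: int,
    output_tokens: int,
    input_price: float,
    output_price: float,
    cached_price: Optional[float] = None,
    cached_fraction: float = 0.0,
) -> float:
    """Estimate monthly spend in dollars from token counts and $/Mtok prices.

    cached_fraction is the assumed share of input tokens billed at cached_price.
    """
    if cached_price is None:
        # No cached rate listed: bill every input token at the regular rate.
        cached_price = input_price
        cached_fraction = 0.0
    cached_tokens = input_tokens * cached_fraction
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * input_price
        + cached_tokens * cached_price
        + output_tokens * output_price
    )
    return total / 1_000_000  # prices are quoted per 1M tokens


# Claude Haiku 4.5 prices from the table; the traffic figures are made up.
no_cache = monthly_cost(50_000_000, 10_000_000, 1.00, 5.00)
half_cached = monthly_cost(
    50_000_000, 10_000_000, 1.00, 5.00, cached_price=0.100, cached_fraction=0.5
)
print(f"without caching: ${no_cache:.2f}")
print(f"with half the input cached: ${half_cached:.2f}")
```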