Open Weight Models
77 open-weight models hosted across inference providers. Compare hosting prices: the model weights are free, so you pay only for compute.
At a glance: 77 open-weight models; cheapest input $0.000/Mtok; fastest inference 1000 TPS; largest context 10M tokens.
| Model | Hosted by | Parameters | Input $/Mtok | Output $/Mtok | Context | Notes |
|---|---|---|---|---|---|---|
| GLM-4.7-Flash | Zhipu | — | $0.000 | $0.000 | 128K | |
| Granite 4.0 Micro | IBM | — | $0.017 | $0.112 | 128K | |
| LFM2 24B A2B | Together | 24B (A2B) | $0.030 | $0.120 | 128K | |
| GLM-OCR | Zhipu | — | $0.030 | $0.030 | 128K | |
| Llama 3.1 8B Instant | Groq | 8B | $0.050 | $1.00 | 128K | 840 TPS |
| Llama 3.1 8B | Meta | 8B | $0.050 | $0.080 | 128K | |
| Granite 4 H Small | IBM | — | $0.060 | $0.250 | 128K | |
| Baichuan M2-32B | Baichuan | 32B | $0.070 | $0.070 | 33K | |
| GPT OSS 20B | Fireworks | 20B | $0.070 | $0.300 | 128K | cached $0.035 |
| GLM-4.7-FlashX | Zhipu | — | $0.070 | $0.400 | 128K | cached $0.010 |
| GPT-OSS 20B | Groq | 20B | $0.075 | $1.00 | 128K | 1000 TPS |
| Mistral Small 3.2 24B | Mistral | 24B | $0.080 | $0.200 | 128K | |
| Ministral 3 3B | Mistral | 3B | $0.100 | $0.100 | 128K | |
| Pixtral 12B | Mistral | 12B | $0.100 | $0.100 | 128K | |
| Voxtral Small 24B | Mistral | 24B | $0.100 | $0.300 | 128K | |
| Nemotron 70B Instruct | NVIDIA | 70B | $0.100 | $0.100 | 128K | |
| GLM-4-32B-0414 | Zhipu | 32B | $0.100 | $0.100 | 128K | |
| Granite Embedding 278M Multilingual | IBM | 278M | $0.106 | $0.106 | — | |
| Llama 4 Scout | Groq | 17B (16 experts) | $0.110 | $1.00 | 10M | 594 TPS |
| Llama 4 Scout | Meta | 17B (16 experts) | $0.110 | $0.340 | 10M | |
| GPT OSS 120B | Fireworks | 120B | $0.150 | $0.600 | 128K | cached $0.015 |
| GPT-OSS 120B | Groq | 120B | $0.150 | $1.00 | 128K | 500 TPS |
| Granite 4 H Medium | IBM | — | $0.150 | $0.600 | 128K | |
| gpt-oss-120B | Together | 120B | $0.150 | $0.600 | 128K | |
| Jamba Mini | AI21 Labs | — | $0.200 | $0.400 | 256K | |
| GLM-4.5-Air | Zhipu | — | $0.200 | $1.10 | 128K | cached $0.030 |
| Gemma-4-31B-it-Pearl | Together | 31B | $0.280 | $0.860 | 128K | |
| Qwen3 32B | Groq | 32B | $0.290 | $1.00 | 128K | 662 TPS |
| MiniMax 2.5 | Fireworks | — | $0.300 | $1.20 | 128K | cached $0.030 |
| MiniMax 2.7 | Fireworks | — | $0.300 | $1.20 | 128K | cached $0.060 |
| MiniMax M3 | Fireworks | — | $0.300 | $1.20 | 1M | cached $0.060 |
| Granite 4 H Large | IBM | — | $0.300 | $1.20 | 128K | |
| MiniMax-M2 | MiniMax | — | $0.300 | $1.20 | 205K | cached $0.030 |
| MiniMax-M2.1 | MiniMax | — | $0.300 | $1.20 | 205K | cached $0.030 |
| MiniMax-M2.5 | MiniMax | — | $0.300 | $1.20 | 205K | cached $0.030 |
| MiniMax-M2.7 | MiniMax | — | $0.300 | $1.20 | 205K | cached $0.060 |
| MiniMax-M3 | MiniMax | — | $0.300 | $1.20 | 1M | cached $0.060 |
| MiniMax M2.5 | Together | — | $0.300 | $1.20 | 128K | cached $0.060 |
| MiniMax M3 | Together | — | $0.300 | $1.20 | 1M | cached $0.060 |
| GLM-4.6V | Zhipu | — | $0.300 | $0.900 | 128K | cached $0.050 |
| Qwen3.7-Plus | Together | — | $0.320 | $1.28 | 128K | |
| Qwen 3.7 Plus | Fireworks | — | $0.400 | $1.60 | 128K | cached $0.080 |
| Devstral 2 2512 | Mistral | — | $0.400 | $2.00 | 256K | |
| Qwen 3.6 Plus | Fireworks | — | $0.500 | $3.00 | 128K | cached $0.100 |
| Mixtral 8x7B Instruct | Mistral | 46.7B (8x7B MoE) | $0.540 | $0.540 | 32K | |
| Llama 3.3 70B Versatile | Groq | 70B | $0.590 | $1.00 | 128K | 394 TPS |
| Llama 3.3 70B | Meta | 70B | $0.590 | $0.790 | 128K | |
| Kimi K2.5 | Fireworks | — | $0.600 | $3.00 | 256K | cached $0.100 |
| NVIDIA Nemotron 3 Ultra | Fireworks | — | $0.600 | $2.40 | 128K | cached $0.120 |
| Qwen 3.6 27B | Groq | 27B | $0.600 | $1.00 | 128K | 500 TPS |
| Kimi K2.5 | Moonshot | — | $0.600 | $3.00 | 262K | cached $0.100 |
| Llama Nemotron Ultra 253B | NVIDIA | 253B | $0.600 | $3.60 | 128K | |
| Nemotron 3 Ultra | NVIDIA | — | $0.600 | $3.60 | 128K | cached $0.120 |
| NVIDIA Nemotron 3 Ultra | Together | — | $0.600 | $3.60 | 128K | cached $0.200 |
| Qwen3.5-397B-A17B | Together | 397B (A17B) | $0.600 | $3.60 | 128K | cached $0.350 |
| GLM-4.5 | Zhipu | — | $0.600 | $2.20 | 128K | cached $0.110 |
| GLM-4.6 | Zhipu | — | $0.600 | $2.20 | 128K | cached $0.110 |
| GLM-4.7 | Zhipu | — | $0.600 | $2.20 | 128K | cached $0.110 |
| Kimi K2.6 | Fireworks | — | $0.950 | $4.00 | 256K | cached $0.160 |
| Kimi K2.7 Code | Fireworks | — | $0.950 | $4.00 | 256K | cached $0.190 |
| Kimi K2.6 | Moonshot | — | $0.950 | $4.00 | 262K | cached $0.160 |
| Kimi K2.7 Code | Moonshot | — | $0.950 | $4.00 | 262K | cached $0.190 |
| Kimi K2.7 Code | Together | — | $0.950 | $4.00 | 256K | cached $0.190 |
| GLM-5 | Zhipu | — | $1.00 | $3.20 | 128K | cached $0.200 |
| Llama 3.3 70B | Together | 70B | $1.04 | $1.04 | 128K | |
| GLM-5-Turbo | Zhipu | — | $1.20 | $4.00 | 128K | cached $0.240 |
| GLM-5V-Turbo | Zhipu | — | $1.20 | $4.00 | 128K | cached $0.240 |
| Qwen3.7-Max | Together | — | $1.25 | $3.75 | 128K | cached $0.130 |
| GLM 5.1 | Fireworks | — | $1.40 | $4.40 | 128K | cached $0.260 |
| GLM 5.2 | Fireworks | — | $1.40 | $4.40 | 128K | cached $0.260 |
| GLM-5.2 | Together | — | $1.40 | $4.40 | 128K | cached $0.260 |
| GLM-5.1 | Zhipu | — | $1.40 | $4.40 | 128K | cached $0.260 |
| GLM-5.2 | Zhipu | — | $1.40 | $4.40 | 128K | cached $0.260 |
| Kimi K2.7 Code HighSpeed | Moonshot | — | $1.90 | $8.00 | 262K | cached $0.380 |
| Jamba Large | AI21 Labs | — | $2.00 | $8.00 | 256K | |
| Mixtral 8x22B Instruct | Mistral | 141B (8x22B MoE) | $2.00 | $6.00 | 64K | |
| Pixtral Large 2411 | Mistral | 124B | $2.00 | $6.00 | 128K | |
Open-weight models have freely available weights, so anyone can download and run them. You pay only for the compute to serve inference. Hosting providers such as Groq (custom LPU chips, fastest inference), Together AI (200+ models, competitive pricing), and Fireworks offer per-token pricing without lock-in.
The same model can have very different prices depending on the host. For example, the table lists Llama 3.3 70B at $0.59 input / $0.79 output per Mtok in the Meta row, $0.59 / $1.00 on Groq (as Llama 3.3 70B Versatile), and $1.04 / $1.04 on Together. Use the Compare page to view the same model side by side across providers.
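To make the host-to-host difference concrete, here is a minimal Python sketch that prices one workload against the three Llama 3.3 70B rows in the table. The `Offer` class, the `request_cost` and `rank_offers` helpers, and the workload size (2M input tokens, 500K output tokens) are illustrative assumptions, not part of any provider's API. Cached-token discounts are ignored, and the prices are copied from the table, so they may be out of date.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """One host's listing for a model (hypothetical helper type)."""
    host: str
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens


def request_cost(offer: Offer, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD: tokens times price per million tokens, summed over both directions."""
    return (
        input_tokens * offer.input_per_mtok
        + output_tokens * offer.output_per_mtok
    ) / 1_000_000


def rank_offers(offers: list[Offer], input_tokens: int, output_tokens: int) -> list[Offer]:
    """Return the offers sorted from cheapest to most expensive for this workload."""
    return sorted(offers, key=lambda o: request_cost(o, input_tokens, output_tokens))


# Llama 3.3 70B rows as listed in the table above (assumed current).
LLAMA_3_3_70B = [
    Offer("Groq", 0.59, 1.00),
    Offer("Meta", 0.59, 0.79),
    Offer("Together", 1.04, 1.04),
]

if __name__ == "__main__":
    # Assumed workload: 2M input tokens and 500K output tokens.
    for offer in rank_offers(LLAMA_3_3_70B, 2_000_000, 500_000):
        cost = request_cost(offer, 2_000_000, 500_000)
        print(f"{offer.host:<10} ${cost:.3f}")
```

Because input and output are priced separately, the ranking depends on the workload mix: an output-heavy job penalizes hosts with a high output price more than an input-heavy one does.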