About Groq
Groq is an inference platform that runs open models on custom LPU chips for very high throughput: up to 1,000 tokens/sec, with the cheapest hosted prices for many open models.
7 models
All Groq models
Sorted by blended cost (cheapest first). Prices per 1M tokens.
| Model | Input | Output | Blended | Relative cost | Context | Status |
|---|---|---|---|---|---|---|
| Llama 3.1 8B Instant | $0.050 | $1.00 | $0.525 | | 128K | Current |
| GPT-OSS 20B | $0.075 | $1.00 | $0.537 | | 128K | Current |
| Llama 4 Scout | $0.110 | $1.00 | $0.555 | | 10M | Current |
| GPT-OSS 120B | $0.150 | $1.00 | $0.575 | | 128K | Current |
| Qwen3 32B | $0.290 | $1.00 | $0.645 | | 128K | Current |
| Llama 3.3 70B Versatile | $0.590 | $1.00 | $0.795 | | 128K | Current |
| Qwen 3.6 27B | $0.600 | $1.00 | $0.800 | | 128K | Current |
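The blended figures in the table are consistent with a simple average of the input and output prices (for example, ($0.050 + $1.00) / 2 = $0.525). The page does not state this formula, so treat it as an inference from the listed numbers. The sketch below, under that assumption, shows how a blended price and the cost of a single request could be computed from per-1M-token prices. The `Pricing` class and its method names are hypothetical helpers for illustration, not part of any Groq SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-model prices in USD per 1M tokens (hypothetical helper)."""
    input_per_m: float
    output_per_m: float

    def blended(self) -> float:
        # Assumption: blended cost is the simple mean of input and output
        # prices, which matches the figures listed in the table.
        return (self.input_per_m + self.output_per_m) / 2

    def request_cost(self, input_tokens: int, output_tokens: int) -> float:
        # Prices are quoted per 1M tokens, so scale token counts accordingly.
        total = input_tokens * self.input_per_m + output_tokens * self.output_per_m
        return total / 1_000_000


# Two rows copied from the table above.
PRICES = {
    "Llama 3.3 70B Versatile": Pricing(0.590, 1.00),
    "Llama 3.1 8B Instant": Pricing(0.050, 1.00),
}

# Sort by blended cost, cheapest first, as the table does.
ranked = sorted(PRICES.items(), key=lambda item: item[1].blended())

for name, pricing in ranked:
    print(f"{name}: blended ${pricing.blended():.3f} per 1M tokens")
```

A workload that is heavy on input tokens or heavy on output tokens will see an effective price different from the blended figure, so `request_cost` with realistic token counts is usually a better basis for comparison than the blended column alone.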
Quick stats
Open-weight models
7 open-weight models available from this provider.