
Groq

Hosting provider · 7 models tracked · Founded 2016

About Groq

Groq is an inference platform that runs open models on its custom LPU chips, which are built for speed. It serves up to 1000 tokens/sec and offers the cheapest hosted prices for many open models.


All Groq models

Sorted by blended cost (cheapest first). Prices per 1M tokens.

Model                     Input    Output   Blended  Context  Status
Llama 3.1 8B Instant      $0.050   $1.00    $0.525   128K     Current
GPT-OSS 20B               $0.075   $1.00    $0.537   128K     Current
Llama 4 Scout             $0.110   $1.00    $0.555   10M      Current
GPT-OSS 120B              $0.150   $1.00    $0.575   128K     Current
Qwen3 32B                 $0.290   $1.00    $0.645   128K     Current
Llama 3.3 70B Versatile   $0.590   $1.00    $0.795   128K     Current
Qwen 3.6 27B              $0.600   $1.00    $0.800   128K     Current
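
The Blended figures in the table match a plain average of the input and output prices, for example ($0.050 + $1.00) / 2 = $0.525. The page does not state this formula, so treat it as an inference from the listed numbers. The sketch below, under that assumption, recomputes the blended cost, reproduces the cheapest-first ordering, and estimates the cost of a single request. The names `PRICES`, `blended`, and `request_cost` are hypothetical helpers written for illustration and are not part of any Groq API.

```python
# Prices in USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "Llama 3.1 8B Instant": (0.050, 1.00),
    "GPT-OSS 20B": (0.075, 1.00),
    "Llama 4 Scout": (0.110, 1.00),
    "GPT-OSS 120B": (0.150, 1.00),
    "Qwen3 32B": (0.290, 1.00),
    "Llama 3.3 70B Versatile": (0.590, 1.00),
    "Qwen 3.6 27B": (0.600, 1.00),
}


def blended(input_price, output_price):
    """Assumed formula: plain average of the input and output price."""
    return (input_price + output_price) / 2


def request_cost(model, input_tokens, output_tokens):
    """Hypothetical helper: USD cost of one request at the listed prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Rank models by blended cost, cheapest first, as the page does.
ranking = sorted(PRICES, key=lambda name: blended(*PRICES[name]))

for name in ranking:
    print(f"{name:<26} ${blended(*PRICES[name]):.4f}")
```

Under these prices, a request with 2,000 input tokens and 500 output tokens on Llama 3.1 8B Instant would cost about $0.0006. Real workloads rarely split tokens evenly between input and output, so a blended figure is only a rough guide to actual spend.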

Quick stats

Models tracked: 7
Type: hosting
Founded: 2016
Cheapest model: Llama 3.1 8B Instant
Cheapest blended: $0.525/M

Open-weight models

7 open-weight models available from this provider.
