Groq

Hosting provider · 7 models tracked · Founded 2016

Inference platform that runs open models on custom LPU chips for very high throughput: up to 1000 tokens/sec, with the cheapest hosted prices for many open models.

All Groq models

Sorted by blended cost, cheapest first. Prices are in USD per 1M tokens.

Model	Input	Output	Blended	Context	Status
Llama 3.1 8B Instant	$0.050	$1.00	$0.525	128K	Current
GPT-OSS 20B	$0.075	$1.00	$0.537	128K	Current
Llama 4 Scout	$0.110	$1.00	$0.555	10M	Current
GPT-OSS 120B	$0.150	$1.00	$0.575	128K	Current
Qwen3 32B	$0.290	$1.00	$0.645	128K	Current
Llama 3.3 70B Versatile	$0.590	$1.00	$0.795	128K	Current
Qwen 3.6 27B	$0.600	$1.00	$0.800	128K	Current
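
The page does not define the Blended column. The listed values are consistent with a simple mean of the input and output prices (for example, ($0.050 + $1.00) / 2 = $0.525), so the sketch below assumes that definition. The GPT-OSS 20B row shows $0.537 where the mean is 0.5375, which suggests the page truncates rather than rounds; that is also an inference. The names `ModelPrice`, `cheapest_first`, and `request_cost_usd` are hypothetical helpers for illustration and are not part of any Groq API. Only three rows from the table are included to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    name: str
    input_usd: float   # USD per 1M input tokens, copied from the table
    output_usd: float  # USD per 1M output tokens, copied from the table

    @property
    def blended_usd(self) -> float:
        # Assumption: blended cost is the unweighted mean of input and output.
        return (self.input_usd + self.output_usd) / 2


# A few rows from the table, deliberately listed out of order.
MODELS = [
    ModelPrice("Llama 3.3 70B Versatile", 0.590, 1.00),
    ModelPrice("Llama 3.1 8B Instant", 0.050, 1.00),
    ModelPrice("Qwen3 32B", 0.290, 1.00),
]


def cheapest_first(models):
    """Return the models sorted by blended cost, lowest first."""
    return sorted(models, key=lambda m: m.blended_usd)


def request_cost_usd(model, input_tokens, output_tokens):
    """Estimate the cost of one request from per-1M-token prices."""
    total = input_tokens * model.input_usd + output_tokens * model.output_usd
    return total / 1_000_000


if __name__ == "__main__":
    for m in cheapest_first(MODELS):
        print(f"{m.name}: ${m.blended_usd:.3f} blended")
```

A 50/50 blend only matches real spend when a workload sends and receives about the same number of tokens. For prompt-heavy or generation-heavy workloads, `request_cost_usd` with actual token counts gives a more useful comparison than the Blended column.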