Cerebras
AGGREGATED · INFERENCE
Uptime: N/A
Rating: N/A
30-Day Uptime
96.7% (2026-03-23 to 2026-04-21)
Inference Latency
Meta: Llama 3.1 8B Instruct (200ms TTFT · 60 TPS)
OpenAI: gpt-oss-120b (791ms TTFT · 301 TPS)
Qwen: Qwen3 235B A22B Instruct 2507 (837ms TTFT · 3 TPS)
Z.ai: GLM 4.7 (450ms TTFT · 352 TPS)
Inference Models
| Model | Input $/M | Output $/M | TTFT | TPS |
|---|---|---|---|---|
| Meta: Llama 3.1 8B Instruct | $0.10 | $0.10 | 200ms | 60 |
| OpenAI: gpt-oss-120b | $0.35 | $0.75 | 791ms | 301 |
| Qwen: Qwen3 235B A22B Instruct 2507 | $0.60 | $1.20 | 837ms | 3 |
| Z.ai: GLM 4.7 | $2.25 | $2.75 | 450ms | 352 |
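
The prices and latency figures in the table can be combined into a rough per-request estimate. The sketch below is illustrative only: the function name `estimate_request` and the example token counts are hypothetical, and it assumes that cost scales linearly with token counts and that total response time is roughly TTFT plus output tokens divided by TPS. The listing does not say how TTFT and TPS were measured, so treat the results as approximations.

```python
# Figures copied from the Inference Models table.
# Prices are USD per million tokens; TTFT is in milliseconds; TPS is output tokens per second.
MODELS = {
    "Meta: Llama 3.1 8B Instruct": {"input": 0.10, "output": 0.10, "ttft_ms": 200, "tps": 60},
    "OpenAI: gpt-oss-120b": {"input": 0.35, "output": 0.75, "ttft_ms": 791, "tps": 301},
    "Qwen: Qwen3 235B A22B Instruct 2507": {"input": 0.60, "output": 1.20, "ttft_ms": 837, "tps": 3},
    "Z.ai: GLM 4.7": {"input": 2.25, "output": 2.75, "ttft_ms": 450, "tps": 352},
}


def estimate_request(model, input_tokens, output_tokens):
    """Return (cost in USD, approximate seconds) for one request.

    Assumption: latency is modeled as TTFT + output_tokens / TPS.
    """
    m = MODELS[model]
    cost = (input_tokens * m["input"] + output_tokens * m["output"]) / 1_000_000
    seconds = m["ttft_ms"] / 1000 + output_tokens / m["tps"]
    return cost, seconds


if __name__ == "__main__":
    # Hypothetical request size: 2,000 input tokens and 500 output tokens.
    for name in MODELS:
        cost, seconds = estimate_request(name, 2000, 500)
        print(f"{name}: ${cost:.6f}, about {seconds:.1f}s")
```

Because the estimate depends directly on the TPS column, a model with a very low listed TPS will dominate the time comparison regardless of its TTFT.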
Community Reviews
4.5 ★★★★★ (2 reviews)
clouduser42
★★★★★ 2025-06-15
Reliable service, great API documentation.
mlresearcher
★★★★☆ 2025-06-10
Good performance but support could be faster.