WandB
AGGREGATED · INFERENCE
Uptime: N/A
Rating: N/A
30-Day Uptime
100% (2026-03-23 to 2026-04-21)
Inference Latency
OpenAI: gpt-oss-20b · 204ms TTFT · 270 TPS
Qwen: Qwen3 235B A22B Instruct 2507 · 282ms TTFT · 31 TPS
Qwen: Qwen3 30B A3B Instruct 2507 · 212ms TTFT · 47 TPS
OpenAI: gpt-oss-120b · 360ms TTFT · 59 TPS
DeepSeek: DeepSeek V3.1 · 651ms TTFT · 32 TPS
Meta: Llama 3.3 70B Instruct · 199ms TTFT · 84 TPS
Meta: Llama 3.1 70B Instruct · 352ms TTFT · 28 TPS
Qwen: Qwen3 Coder 480B A35B · 5300ms TTFT · 15 TPS
Inference Models
| Model | Input ($/M tokens) | Output ($/M tokens) | TTFT | TPS |
|---|---|---|---|---|
| OpenAI: gpt-oss-20b | $0.05 | $0.20 | 204ms | 270 |
| Qwen: Qwen3 235B A22B Instruct 2507 | $0.10 | $0.10 | 282ms | 31 |
| Qwen: Qwen3 30B A3B Instruct 2507 | $0.10 | $0.30 | 212ms | 47 |
| OpenAI: gpt-oss-120b | $0.15 | $0.60 | 360ms | 59 |
| Meta: Llama 3.1 8B Instruct | $0.22 | $0.22 | — | — |
| DeepSeek: DeepSeek V3.1 | $0.55 | $1.65 | 651ms | 32 |
| Meta: Llama 3.3 70B Instruct | $0.71 | $0.71 | 199ms | 84 |
| Meta: Llama 3.1 70B Instruct | $0.80 | $0.80 | 352ms | 28 |
| Qwen: Qwen3 Coder 480B A35B | $1.00 | $1.50 | 5300ms | 15 |
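
To make the table easier to compare, here is a minimal Python sketch that estimates the cost and rough wall-clock time of a single request. It assumes that `$/M` means US dollars per one million tokens and that TPS is the output generation rate in tokens per second after the first token arrives; the listing states neither explicitly. The `MODELS` dictionary and both function names are hypothetical helpers for illustration, not part of any WandB API, and only three rows are copied in.

```python
# Figures copied from three rows of the table above.
# Assumption: prices are USD per 1,000,000 tokens; TPS is output tokens/second.
MODELS = {
    "gpt-oss-20b": {"input": 0.05, "output": 0.20, "ttft_ms": 204, "tps": 270},
    "DeepSeek V3.1": {"input": 0.55, "output": 1.65, "ttft_ms": 651, "tps": 32},
    "Llama 3.3 70B Instruct": {"input": 0.71, "output": 0.71, "ttft_ms": 199, "tps": 84},
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated request cost: each token count times its per-million price."""
    m = MODELS[model]
    return (input_tokens * m["input"] + output_tokens * m["output"]) / 1_000_000


def estimate_seconds(model: str, output_tokens: int) -> float:
    """Rough response time: time to first token plus streaming time at TPS."""
    m = MODELS[model]
    return m["ttft_ms"] / 1000 + output_tokens / m["tps"]


if __name__ == "__main__":
    # Compare a hypothetical request of 2,000 input and 500 output tokens.
    for name in MODELS:
        cost = estimate_cost_usd(name, 2_000, 500)
        secs = estimate_seconds(name, 500)
        print(f"{name}: ${cost:.6f}, about {secs:.1f}s")
```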
Community Reviews
4.5 ★★★★★ (2 reviews)
clouduser42
★★★★★ 2025-06-15
Reliable service, great API documentation.
mlresearcher
★★★★☆ 2025-06-10
Good performance but support could be faster.