WandB
Aggregated · Inference
Uptime: N/A
Rating: N/A
30-Day Uptime: 96.7% (2026-05-22 to 2026-06-20)
Inference Latency
| Model | TTFT | TPS |
|---|---|---|
| OpenAI: gpt-oss-20b | 294ms | 171 |
| OpenAI: gpt-oss-120b | 320ms | 113 |
| IBM: Granite 4.1 8B | 179ms | 112 |
| Microsoft: Phi 4 Mini Instruct | 98ms | 182 |
| Qwen: Qwen3 235B A22B Thinking 2507 | 302ms | 82 |
| Qwen: Qwen3 30B A3B Instruct 2507 | 288ms | 61 |
| Qwen: Qwen3 235B A22B Instruct 2507 | 273ms | 73 |
| Google: Gemma 4 31B | 408ms | 43 |
| DeepSeek: DeepSeek V4 Flash | 572ms | 21 |
| Meta: Llama 3.1 8B Instruct | 168ms | 103 |
| Qwen: Qwen3.6 35B A3B | 242ms | 206 |
| MiniMax: MiniMax M2.5 | 541ms | 82 |
| DeepSeek: DeepSeek V3.1 | 354ms | 29 |
| Qwen: Qwen3.6 27B | 749ms | 95 |
| Meta: Llama 3.3 70B Instruct | 251ms | 54 |
| Meta: Llama 3.1 70B Instruct | 324ms | 28 |
| MoonshotAI: Kimi K2.6 | 401ms | 143 |
| Qwen: Qwen3 Coder 480B A35B | 3666ms | 24 |
| DeepSeek: DeepSeek V4 Pro | 598ms | 13 |
Inference Models
| Model | Input ($/M tokens) | Output ($/M tokens) | TTFT | TPS |
|---|---|---|---|---|
| OpenAI: gpt-oss-20b | $0.03 | $0.13 | 294ms | 171 |
| OpenAI: gpt-oss-120b | $0.04 | $0.14 | 320ms | 113 |
| IBM: Granite 4.1 8B | $0.05 | $0.10 | 179ms | 112 |
| Microsoft: Phi 4 Mini Instruct | $0.08 | $0.35 | 98ms | 182 |
| Qwen: Qwen3 235B A22B Thinking 2507 | $0.10 | $0.10 | 302ms | 82 |
| Qwen: Qwen3 30B A3B Instruct 2507 | $0.10 | $0.30 | 288ms | 61 |
| Qwen: Qwen3 235B A22B Instruct 2507 | $0.10 | $0.10 | 273ms | 73 |
| Google: Gemma 4 31B | $0.12 | $0.35 | 408ms | 43 |
| DeepSeek: DeepSeek V4 Flash | $0.14 | $0.28 | 572ms | 21 |
| Meta: Llama 3.1 8B Instruct | $0.22 | $0.22 | 168ms | 103 |
| Qwen: Qwen3.5-35B-A3B | $0.25 | $1.25 | — | — |
| Qwen: Qwen3.6 35B A3B | $0.25 | $1.25 | 242ms | 206 |
| MiniMax: MiniMax M2.5 | $0.30 | $1.20 | 541ms | 82 |
| DeepSeek: DeepSeek V3.1 | $0.55 | $1.65 | 354ms | 29 |
| Qwen: Qwen3.6 27B | $0.60 | $3.60 | 749ms | 95 |
| Meta: Llama 3.3 70B Instruct | $0.71 | $0.71 | 251ms | 54 |
| Meta: Llama 3.1 70B Instruct | $0.80 | $0.80 | 324ms | 28 |
| MoonshotAI: Kimi K2.6 | $0.95 | $4.00 | 401ms | 143 |
| Qwen: Qwen3 Coder 480B A35B | $1.00 | $1.50 | 3666ms | 24 |
| DeepSeek: DeepSeek V4 Pro | $1.74 | $3.48 | 598ms | 13 |
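The figures in the table can be combined into a rough per-request estimate. The sketch below assumes that prices are USD per million tokens, that TTFT is the delay before the first output token, and that TPS is the steady output rate in tokens per second after that. `estimate_request` is a hypothetical helper written for illustration, not part of any WandB API, and real requests will vary with load and prompt length.

```python
def estimate_request(input_tokens, output_tokens,
                     input_price_per_m, output_price_per_m,
                     ttft_ms, tps):
    """Return (cost_usd, latency_seconds) for a single request.

    Assumes prices are USD per 1,000,000 tokens and that generation
    proceeds at a constant `tps` once the first token has arrived.
    """
    if tps <= 0:
        raise ValueError("tps must be positive")

    # Cost: each token count is scaled by its per-million price.
    cost_usd = (input_tokens * input_price_per_m
                + output_tokens * output_price_per_m) / 1_000_000

    # Latency: time to first token plus time to stream the output.
    latency_s = ttft_ms / 1000 + output_tokens / tps
    return cost_usd, latency_s


# Example using the OpenAI: gpt-oss-20b row ($0.03 in, $0.13 out, 294ms, 171 TPS)
# for a request with 2,000 input tokens and 500 output tokens.
cost, latency = estimate_request(2000, 500, 0.03, 0.13, 294, 171)
print(f"cost: ${cost:.6f}, latency: {latency:.2f}s")
```

Because the latency term divides output length by TPS, a model with a low TTFT but low TPS can still be slower end to end on long completions than one with a higher TTFT and higher TPS.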
Community Reviews
4.5 ★★★★★ (2 reviews)
clouduser42
★★★★★ 2025-06-15
Reliable service, great API documentation.
mlresearcher
★★★★☆ 2025-06-10
Good performance but support could be faster.