Phala
AGGREGATED · INFERENCE
Uptime: N/A
Rating: N/A
30-Day Uptime
100% (2026-05-22 to 2026-06-20)
Inference Latency
- Qwen: Qwen2.5 7B Instruct (1056ms TTFT · 45 TPS)
- OpenAI: gpt-oss-20b (953ms TTFT · 46 TPS)
- Z.ai: GLM 4.7 Flash (943ms TTFT · 22 TPS)
- OpenAI: gpt-oss-120b (1395ms TTFT · 53 TPS)
- Google: Gemma 3 27B (2687ms TTFT · 12 TPS)
- Google: Gemma 4 31B (3194ms TTFT · 4 TPS)
- MiniMax: MiniMax M2.5 (1989ms TTFT · 19 TPS)
- Qwen: Qwen3.5-27B (1088ms TTFT · 15 TPS)
- Qwen: Qwen3.6 27B (3420ms TTFT · 20 TPS)
- Qwen: Qwen3.5 397B A17B (2404ms TTFT · 8 TPS)
- MoonshotAI: Kimi K2.5 (1990ms TTFT · 36 TPS)
- Z.ai: GLM 4.7 (3925ms TTFT · 27 TPS)
- MoonshotAI: Kimi K2.6 (2078ms TTFT · 19 TPS)
- Z.ai: GLM 5 (3535ms TTFT · 25 TPS)
- Z.ai: GLM 5.1 (2521ms TTFT · 27 TPS)
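
The two latency figures combine into a rough end-to-end estimate for a single response. The sketch below assumes TTFT is the time to first token and TPS is steady-state output tokens per second; the function name and the 500-token response length are illustrative, not part of the listing.

```python
def estimate_response_seconds(ttft_ms: float, tps: float, output_tokens: int) -> float:
    """Approximate wall-clock time: wait for the first token, then stream the rest."""
    if tps <= 0:
        raise ValueError("tps must be positive")
    return ttft_ms / 1000.0 + output_tokens / tps


# Figures taken from the latency list above; 500 output tokens is an assumed workload.
fast = estimate_response_seconds(ttft_ms=1395, tps=53, output_tokens=500)  # gpt-oss-120b
slow = estimate_response_seconds(ttft_ms=3194, tps=4, output_tokens=500)   # Gemma 4 31B
print(f"gpt-oss-120b: {fast:.1f}s, Gemma 4 31B: {slow:.1f}s")
```

For longer responses the TPS term dominates, so a model with a low TTFT but low throughput can still be the slower choice overall.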
Inference Models
| Model | Input ($/M tokens) | Output ($/M tokens) | TTFT | TPS |
|---|---|---|---|---|
| Qwen: Qwen2.5 7B Instruct | $0.04 | $0.10 | 1056ms | 45 |
| OpenAI: gpt-oss-20b | $0.04 | $0.15 | 953ms | 46 |
| Z.ai: GLM 4.7 Flash | $0.10 | $0.43 | 943ms | 22 |
| OpenAI: gpt-oss-120b | $0.15 | $0.60 | 1395ms | 53 |
| Google: Gemma 3 27B | $0.15 | $0.46 | 2687ms | 12 |
| Google: Gemma 4 31B | $0.15 | $0.46 | 3194ms | 4 |
| MiniMax: MiniMax M2.5 | $0.20 | $1.38 | 1989ms | 19 |
| Qwen: Qwen3 VL 30B A3B Instruct | $0.20 | $0.70 | — | — |
| Qwen: Qwen3.5-27B | $0.30 | $2.40 | 1088ms | 15 |
| Qwen: Qwen3.6 27B | $0.32 | $2.70 | 3420ms | 20 |
| Qwen: Qwen3.5 397B A17B | $0.55 | $3.50 | 2404ms | 8 |
| MoonshotAI: Kimi K2.5 | $0.60 | $3.00 | 1990ms | 36 |
| Z.ai: GLM 4.7 | $0.85 | $3.30 | 3925ms | 27 |
| MoonshotAI: Kimi K2.6 | $1.09 | $4.60 | 2078ms | 19 |
| Z.ai: GLM 5 | $1.20 | $3.50 | 3535ms | 25 |
| Z.ai: GLM 5.1 | $1.21 | $4.20 | 2521ms | 27 |
| Z.ai: GLM 5.2 | $1.40 | $4.40 | — | — |
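
To turn the per-model prices into a per-request cost, multiply each token count by its rate and divide by one million. This is a minimal sketch that assumes the prices are in US dollars per million tokens; the function name and the token counts are hypothetical.

```python
def request_cost_usd(input_per_m: float, output_per_m: float,
                     input_tokens: int, output_tokens: int) -> float:
    """Cost of one request given prices quoted per million tokens."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


# MoonshotAI: Kimi K2.5 from the table: $0.60 input, $3.00 output.
# The 2000-token prompt and 500-token reply are assumed values.
cost = request_cost_usd(0.60, 3.00, input_tokens=2000, output_tokens=500)
print(f"${cost:.4f}")
```

Because output prices in the table are several times the input prices, the reply length usually matters more to the total than the prompt length.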
Community Reviews
4.5 ★★★★★ (2 reviews)
clouduser42
★★★★★ 2025-06-15
Reliable service, great API documentation.
mlresearcher
★★★★☆ 2025-06-10
Good performance but support could be faster.