Venice
AGGREGATED · INFERENCE
Uptime: N/A
Rating: N/A
30-Day Uptime: 93.3% (2026-05-22 to 2026-06-20)
Inference Latency
| Model | TTFT | TPS |
|---|---|---|
| Meta: Llama 3.3 70B Instruct (free) | 673ms | 42 |
| Venice: Uncensored (free) | 460ms | 66 |
| Qwen: Qwen3 Next 80B A3B Instruct (free) | 543ms | 49 |
| Meta: Llama 3.2 3B Instruct (free) | 1296ms | 49 |
| Mistral: Mistral Small 3.2 24B | 480ms | 42 |
| Qwen: Qwen3.5-9B | 707ms | 72 |
| Google: Gemma 4 31B | 947ms | 28 |
| Z.ai: GLM 4.7 Flash | 883ms | 15 |
| Qwen: Qwen3 235B A22B Instruct 2507 | 552ms | 17 |
| Google: Gemma 4 26B A4B | 1263ms | 10 |
| DeepSeek: DeepSeek V4 Flash | 2633ms | 9 |
| Mistral: Mistral Small 4 | 518ms | 57 |
| Qwen: Qwen3 VL 235B A22B Instruct | 2194ms | 23 |
| Qwen: Qwen3.6 27B | 529ms | 71 |
| Qwen: Qwen3 Coder 480B A35B | 830ms | 19 |
| Z.ai: GLM 4.7 | 1584ms | 48 |
| MoonshotAI: Kimi K2.5 | 1572ms | 123 |
| Qwen: Qwen3.5 397B A17B | 1439ms | 48 |
| Z.ai: GLM 4.6 | 1366ms | 24 |
| MoonshotAI: Kimi K2.6 | 912ms | 64 |
Inference Models
| Model | Input $/M | Output $/M | TTFT | TPS |
|---|---|---|---|---|
| Qwen: Qwen3 4B (free) | $0.00 | $0.00 | — | — |
| Nous: Hermes 3 405B Instruct (free) | $0.00 | $0.00 | — | — |
| Mistral: Mistral Small 3.1 24B (free) | $0.00 | $0.00 | — | — |
| Meta: Llama 3.3 70B Instruct (free) | $0.00 | $0.00 | 673ms | 42 |
| Venice: Uncensored (free) | $0.00 | $0.00 | 460ms | 66 |
| Qwen: Qwen3 Next 80B A3B Instruct (free) | $0.00 | $0.00 | 543ms | 49 |
| Qwen: Qwen3 Coder 480B A35B (free) | $0.00 | $0.00 | — | — |
| Meta: Llama 3.2 3B Instruct (free) | $0.00 | $0.00 | 1296ms | 49 |
| Mistral: Mistral Small 3.2 24B | $0.09 | $0.25 | 480ms | 42 |
| Qwen: Qwen3.5-9B | $0.10 | $0.15 | 707ms | 72 |
| Google: Gemma 4 31B | $0.12 | $0.36 | 947ms | 28 |
| Z.ai: GLM 4.7 Flash | $0.13 | $0.50 | 883ms | 15 |
| Qwen: Qwen3 235B A22B Instruct 2507 | $0.15 | $0.75 | 552ms | 17 |
| Google: Gemma 4 26B A4B | $0.16 | $0.50 | 1263ms | 10 |
| DeepSeek: DeepSeek V4 Flash | $0.17 | $0.35 | 2633ms | 9 |
| Mistral: Mistral Small 4 | $0.19 | $0.75 | 518ms | 57 |
| Qwen: Qwen3 VL 235B A22B Instruct | $0.25 | $1.50 | 2194ms | 23 |
| Arcee AI: Trinity Large Thinking | $0.31 | $1.13 | — | — |
| Qwen: Qwen3.5-35B-A3B | $0.31 | $1.25 | — | — |
| Qwen: Qwen3.6 27B | $0.33 | $3.25 | 529ms | 71 |
| MiniMax: MiniMax M2.5 | $0.34 | $1.19 | — | — |
| Qwen: Qwen3 Coder 480B A35B | $0.35 | $1.50 | 830ms | 19 |
| Z.ai: GLM 4.7 | $0.55 | $2.65 | 1584ms | 48 |
| MoonshotAI: Kimi K2.5 | $0.56 | $3.50 | 1572ms | 123 |
| Qwen: Qwen3.5 397B A17B | $0.75 | $4.50 | 1439ms | 48 |
| Z.ai: GLM 4.6 | $0.85 | $2.75 | 1366ms | 24 |
| MoonshotAI: Kimi K2.6 | $0.85 | $4.66 | 912ms | 64 |
| MoonshotAI: Kimi K2.7 Code | $0.90 | $4.30 | 1329ms | 52 |
| Z.ai: GLM 5 | $1.00 | $3.20 | 1297ms | 60 |
| DeepSeek: DeepSeek V4 Pro | $1.73 | $3.80 | 2412ms | 40 |
| Z.ai: GLM 5.1 | $1.75 | $5.50 | 1290ms | 28 |
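
The columns above can be combined into a rough per-request estimate. The sketch below is illustrative only and rests on assumptions the page does not state: that "$/M" means US dollars per million tokens, that TPS is output tokens per second, and that generation runs at a constant rate after the first token. The `ModelRow` class and both helper functions are hypothetical names, not part of any Venice API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRow:
    """One row of the Inference Models table (hypothetical helper type)."""
    name: str
    input_usd_per_m: float   # "Input $/M", assumed USD per million input tokens
    output_usd_per_m: float  # "Output $/M", assumed USD per million output tokens
    ttft_ms: float           # time to first token, in milliseconds
    tps: float               # assumed to be output tokens per second


def estimate_cost_usd(row: ModelRow, input_tokens: int, output_tokens: int) -> float:
    """Token counts times the per-million prices, converted to dollars."""
    total = input_tokens * row.input_usd_per_m + output_tokens * row.output_usd_per_m
    return total / 1_000_000


def estimate_latency_s(row: ModelRow, output_tokens: int) -> float:
    """TTFT plus generation time at a constant TPS (a simplifying assumption)."""
    return row.ttft_ms / 1000 + output_tokens / row.tps


# Figures copied from the "Mistral: Mistral Small 3.2 24B" row of the table.
mistral_small_32 = ModelRow("Mistral: Mistral Small 3.2 24B", 0.09, 0.25, 480, 42)

cost = estimate_cost_usd(mistral_small_32, input_tokens=2_000, output_tokens=500)
latency = estimate_latency_s(mistral_small_32, output_tokens=500)
print(f"{cost:.6f} USD, {latency:.1f} s")  # prints "0.000305 USD, 12.4 s"
```

Measured TTFT and TPS vary with load and prompt length, so the latency figure is best read as an order-of-magnitude guide rather than a guarantee.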
Community Reviews
4.5 ★★★★★ (2 reviews)
clouduser42
★★★★★ 2025-06-15
Reliable service, great API documentation.
mlresearcher
★★★★☆ 2025-06-10
Good performance but support could be faster.