Models: —+ · Providers: —+ · Cheapest H100: $2.49/hr · Updated: 04:31 PM

Venice

AGGREGATED · INFERENCE
Uptime: N/A
Rating: N/A

30-Day Uptime

93.3%
2026-05-22 to 2026-06-20

Inference Latency

Meta: Llama 3.3 70B Instruct (free) · 673ms TTFT · 42 TPS
Venice: Uncensored (free) · 460ms TTFT · 66 TPS
Qwen: Qwen3 Next 80B A3B Instruct (free) · 543ms TTFT · 49 TPS
Meta: Llama 3.2 3B Instruct (free) · 1296ms TTFT · 49 TPS
Mistral: Mistral Small 3.2 24B · 480ms TTFT · 42 TPS
Qwen: Qwen3.5-9B · 707ms TTFT · 72 TPS
Google: Gemma 4 31B · 947ms TTFT · 28 TPS
Z.ai: GLM 4.7 Flash · 883ms TTFT · 15 TPS
Qwen: Qwen3 235B A22B Instruct 2507 · 552ms TTFT · 17 TPS
Google: Gemma 4 26B A4B · 1263ms TTFT · 10 TPS
DeepSeek: DeepSeek V4 Flash · 2633ms TTFT · 9 TPS
Mistral: Mistral Small 4 · 518ms TTFT · 57 TPS
Qwen: Qwen3 VL 235B A22B Instruct · 2194ms TTFT · 23 TPS
Qwen: Qwen3.6 27B · 529ms TTFT · 71 TPS
Qwen: Qwen3 Coder 480B A35B · 830ms TTFT · 19 TPS
Z.ai: GLM 4.7 · 1584ms TTFT · 48 TPS
MoonshotAI: Kimi K2.5 · 1572ms TTFT · 123 TPS
Qwen: Qwen3.5 397B A17B · 1439ms TTFT · 48 TPS
Z.ai: GLM 4.6 · 1366ms TTFT · 24 TPS
MoonshotAI: Kimi K2.6 · 912ms TTFT · 64 TPS

Inference Models

| Model | Input ($/M tokens) | Output ($/M tokens) | TTFT | TPS |
| --- | --- | --- | --- | --- |
| Qwen: Qwen3 4B (free) | $0.00 | $0.00 | | |
| Nous: Hermes 3 405B Instruct (free) | $0.00 | $0.00 | | |
| Mistral: Mistral Small 3.1 24B (free) | $0.00 | $0.00 | | |
| Meta: Llama 3.3 70B Instruct (free) | $0.00 | $0.00 | 673ms | 42 |
| Venice: Uncensored (free) | $0.00 | $0.00 | 460ms | 66 |
| Qwen: Qwen3 Next 80B A3B Instruct (free) | $0.00 | $0.00 | 543ms | 49 |
| Qwen: Qwen3 Coder 480B A35B (free) | $0.00 | $0.00 | | |
| Meta: Llama 3.2 3B Instruct (free) | $0.00 | $0.00 | 1296ms | 49 |
| Mistral: Mistral Small 3.2 24B | $0.09 | $0.25 | 480ms | 42 |
| Qwen: Qwen3.5-9B | $0.10 | $0.15 | 707ms | 72 |
| Google: Gemma 4 31B | $0.12 | $0.36 | 947ms | 28 |
| Z.ai: GLM 4.7 Flash | $0.13 | $0.50 | 883ms | 15 |
| Qwen: Qwen3 235B A22B Instruct 2507 | $0.15 | $0.75 | 552ms | 17 |
| Google: Gemma 4 26B A4B | $0.16 | $0.50 | 1263ms | 10 |
| DeepSeek: DeepSeek V4 Flash | $0.17 | $0.35 | 2633ms | 9 |
| Mistral: Mistral Small 4 | $0.19 | $0.75 | 518ms | 57 |
| Qwen: Qwen3 VL 235B A22B Instruct | $0.25 | $1.50 | 2194ms | 23 |
| Arcee AI: Trinity Large Thinking | $0.31 | $1.13 | | |
| Qwen: Qwen3.5-35B-A3B | $0.31 | $1.25 | | |
| Qwen: Qwen3.6 27B | $0.33 | $3.25 | 529ms | 71 |
| MiniMax: MiniMax M2.5 | $0.34 | $1.19 | | |
| Qwen: Qwen3 Coder 480B A35B | $0.35 | $1.50 | 830ms | 19 |
| Z.ai: GLM 4.7 | $0.55 | $2.65 | 1584ms | 48 |
| MoonshotAI: Kimi K2.5 | $0.56 | $3.50 | 1572ms | 123 |
| Qwen: Qwen3.5 397B A17B | $0.75 | $4.50 | 1439ms | 48 |
| Z.ai: GLM 4.6 | $0.85 | $2.75 | 1366ms | 24 |
| MoonshotAI: Kimi K2.6 | $0.85 | $4.66 | 912ms | 64 |
| MoonshotAI: Kimi K2.7 Code | $0.90 | $4.30 | 1329ms | 52 |
| Z.ai: GLM 5 | $1.00 | $3.20 | 1297ms | 60 |
| DeepSeek: DeepSeek V4 Pro | $1.73 | $3.80 | 2412ms | 40 |
| Z.ai: GLM 5.1 | $1.75 | $5.50 | 1290ms | 28 |
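
The listed prices and latency figures can be combined into a rough per-request estimate. The sketch below is illustrative only and rests on assumptions not stated on this page: that "$/M" means US dollars per million tokens, that TPS is output tokens per second, and that total latency is roughly TTFT plus generation time. The function name `estimate_request` and the token counts are hypothetical.

```python
def estimate_request(input_price_per_m, output_price_per_m, ttft_ms, tps,
                     input_tokens, output_tokens):
    """Return (cost in dollars, latency in seconds) for one request.

    Assumes prices are per million tokens and TPS is output tokens per second.
    """
    # Cost: each token is billed at (price per million) / 1,000,000.
    cost = (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000
    # Latency: time to first token plus time to stream the output tokens.
    latency_s = ttft_ms / 1000 + output_tokens / tps
    return cost, latency_s


# Figures taken from the "Mistral: Mistral Small 3.2 24B" row:
# $0.09 input, $0.25 output, 480ms TTFT, 42 TPS.
# The 2000-token prompt and 420-token reply are made-up example sizes.
cost, latency = estimate_request(0.09, 0.25, 480, 42,
                                 input_tokens=2000, output_tokens=420)
print(f"cost: ${cost:.6f}, latency: {latency:.2f}s")
```

Measured TTFT and TPS vary with load and prompt size, so the result is best read as an order-of-magnitude guide rather than a guarantee.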

Community Reviews

4.5 ★★★★★ (2 reviews)

clouduser42
★★★★★ · 2025-06-15

Reliable service, great API documentation.

mlresearcher
★★★★ · 2025-06-10

Good performance but support could be faster.