LIVE
Cheapest H100: $2.49/hr · Updated: 04:37 PM

Novita

AGGREGATED · INFERENCE
Uptime: N/A
Rating: N/A

30-Day Uptime

100% (2026-05-22 to 2026-06-20)

Inference Latency

inclusionAI: Ring-2.6-1T (free) | 2682ms TTFT · 57 TPS
inclusionAI: Ling-2.6-1T (free) | 2597ms TTFT · 25 TPS
inclusionAI: Ling-2.6-flash (free) | 3047ms TTFT · 19 TPS
inclusionAI: Ling-2.6-flash | 858ms TTFT · 87 TPS
Meta: Llama 3.1 8B Instruct | 681ms TTFT · 41 TPS
OpenAI: gpt-oss-20b | 730ms TTFT · 135 TPS
Mistral: Mistral Nemo | 856ms TTFT · 19 TPS
OpenAI: gpt-oss-120b (exacto) | 633ms TTFT · 51 TPS
OpenAI: gpt-oss-120b | 434ms TTFT · 173 TPS
NVIDIA: Nemotron 3 Nano 30B A3B | 1016ms TTFT · 88 TPS
Sao10K: Llama 3 8B Lunaris | 289ms TTFT · 71 TPS
Qwen: Qwen3 Coder 30B A3B Instruct | 1195ms TTFT · 93 TPS
Z.ai: GLM 4.7 Flash | 1476ms TTFT · 36 TPS
inclusionAI: Ring-2.6-1T | 2173ms TTFT · 77 TPS
inclusionAI: Ling-2.6-1T | 2335ms TTFT · 37 TPS
Qwen: Qwen3 VL 8B Instruct | 980ms TTFT · 36 TPS
Qwen: Qwen3 235B A22B Instruct 2507 | 1303ms TTFT · 29 TPS
Google: Gemma 3 27B | 25605ms TTFT · 5 TPS
Google: Gemma 4 26B A4B | 1138ms TTFT · 25 TPS
Z.ai: GLM 4.5 Air | 1067ms TTFT · 37 TPS
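
The two latency figures can be combined into a rough end-to-end estimate. The sketch below assumes TTFT is the time to first token in milliseconds and TPS is the output token rate in tokens per second after that first token; the page does not define either term, and the function name `estimated_response_seconds` is made up for illustration.

```python
def estimated_response_seconds(ttft_ms: float, tps: float, output_tokens: int) -> float:
    """Rough response time: wait for the first token, then stream the rest.

    Assumes a constant token rate, which real deployments will not hold exactly.
    """
    if tps <= 0:
        raise ValueError("tps must be positive")
    return ttft_ms / 1000.0 + output_tokens / tps


# TTFT and TPS figures copied from the list above; 500 output tokens is an
# arbitrary example size.
gpt_oss_120b = estimated_response_seconds(ttft_ms=434, tps=173, output_tokens=500)
gemma_3_27b = estimated_response_seconds(ttft_ms=25605, tps=5, output_tokens=500)

print(f"OpenAI: gpt-oss-120b  ~{gpt_oss_120b:.1f}s")
print(f"Google: Gemma 3 27B   ~{gemma_3_27b:.1f}s")
```

Under these assumptions a low TTFT matters most for short replies, while TPS dominates once the output runs to hundreds of tokens.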

Inference Models

Model | Input $/M | Output $/M | TTFT | TPS
inclusionAI: Ring-2.6-1T (free) | $0.00 | $0.00 | 2682ms | 57
inclusionAI: Ling-2.6-1T (free) | $0.00 | $0.00 | 2597ms | 25
inclusionAI: Ling-2.6-flash (free) | $0.00 | $0.00 | 3047ms | 19
inclusionAI: Ling-2.6-flash | $0.01 | $0.03 | 858ms | 87
Meta: Llama 3.1 8B Instruct | $0.02 | $0.05 | 681ms | 41
OpenAI: gpt-oss-20b | $0.04 | $0.15 | 730ms | 135
Mistral: Mistral Nemo | $0.04 | $0.17 | 856ms | 19
OpenAI: gpt-oss-120b (exacto) | $0.04 | $0.20 | 633ms | 51
OpenAI: gpt-oss-120b | $0.05 | $0.25 | 434ms | 173
NVIDIA: Nemotron 3 Nano 30B A3B | $0.05 | $0.20 | 1016ms | 88
Sao10K: Llama 3 8B Lunaris | $0.05 | $0.05 | 289ms | 71
Baidu: ERNIE 4.5 21B A3B Thinking | $0.07 | $0.28 | N/A | N/A
Baidu: ERNIE 4.5 21B A3B | $0.07 | $0.28 | N/A | N/A
Qwen: Qwen3 Coder 30B A3B Instruct | $0.07 | $0.27 | 1195ms | 93
Z.ai: GLM 4.7 Flash | $0.07 | $0.40 | 1476ms | 36
inclusionAI: Ring-2.6-1T | $0.08 | $0.63 | 2173ms | 77
inclusionAI: Ling-2.6-1T | $0.08 | $0.63 | 2335ms | 37
Qwen: Qwen3 VL 8B Instruct | $0.08 | $0.50 | 980ms | 36
Qwen: Qwen3 235B A22B Instruct 2507 | $0.09 | $0.58 | 1303ms | 29
Google: Gemma 3 27B | $0.12 | $0.20 | 25605ms | 5
Google: Gemma 4 26B A4B | $0.13 | $0.40 | 1138ms | 25
Z.ai: GLM 4.5 Air | $0.13 | $0.85 | 1067ms | 37
Meta: Llama 3.3 70B Instruct | $0.14 | $0.40 | N/A | N/A
Google: Gemma 4 31B | $0.14 | $0.40 | 1459ms | 16
Baidu: ERNIE 4.5 VL 28B A3B | $0.14 | $0.56 | N/A | N/A
DeepSeek: DeepSeek V4 Flash | $0.14 | $0.28 | 1031ms | 33
NousResearch: Hermes 2 Pro - Llama-3 8B | $0.14 | $0.14 | N/A | N/A
Qwen: Qwen3 Next 80B A3B Thinking | $0.15 | $1.50 | 1111ms | 225
Qwen: Qwen3 Next 80B A3B Instruct | $0.15 | $1.50 | 1036ms | 82
Meta: Llama 4 Scout | $0.18 | $0.59 | 420ms | 43
Qwen: Qwen3 Coder Next | $0.20 | $1.50 | 2750ms | 20
StepFun: Step 3.7 Flash | $0.20 | $1.15 | N/A | N/A
Qwen: Qwen3 VL 30B A3B Instruct | $0.20 | $0.70 | 2025ms | 23
Qwen: Qwen3 VL 30B A3B Thinking | $0.20 | $1.00 | 4011ms | 82
Kwaipilot: KAT-Coder-Pro V1 | $0.21 | $0.83 | 1788ms | 56
DeepSeek: DeepSeek V3.1 Terminus (exacto) | $0.22 | $0.80 | 2354ms | 26
DeepSeek: DeepSeek V3.2 | $0.27 | $0.40 | 1506ms | 31
DeepSeek: DeepSeek V3.1 | $0.27 | $1.00 | 1709ms | 9
MiniMax: MiniMax M2.7 | $0.27 | $1.08 | 1341ms | 45
Meta: Llama 4 Maverick | $0.27 | $0.85 | 553ms | 26
DeepSeek: DeepSeek V3 0324 | $0.27 | $1.12 | 1166ms | 26
DeepSeek: DeepSeek V3.2 Exp | $0.27 | $0.41 | 1385ms | 19
DeepSeek: DeepSeek V3.1 Terminus | $0.27 | $1.00 | 1672ms | 24
Baidu: ERNIE 4.5 300B A47B | $0.28 | $1.10 | 1629ms | 21
Z.ai: GLM 4.6V | $0.30 | $0.90 | 1650ms | 42
MiniMax: MiniMax M2 | $0.30 | $1.20 | 1371ms | 71
Qwen: Qwen3 VL 235B A22B Instruct | $0.30 | $1.50 | 2251ms | 18
MiniMax: MiniMax M3 | $0.30 | $1.20 | 1468ms | 45
MiniMax: MiniMax M2.5 | $0.30 | $1.20 | N/A | N/A
Qwen: Qwen3.5-27B | $0.30 | $2.40 | N/A | N/A
MiniMax: MiniMax M2.1 | $0.30 | $1.20 | 1459ms | 43
Qwen: Qwen3 235B A22B Thinking 2507 | $0.30 | $3.00 | 476ms | 49
Qwen2.5 72B Instruct | $0.38 | $0.40 | 36831ms | 3
Qwen: Qwen3 Coder 480B A35B | $0.38 | $1.55 | 1389ms | 7
DeepSeek: DeepSeek V3 | $0.40 | $1.30 | 1251ms | 24
Qwen: Qwen3.5-122B-A10B | $0.40 | $3.20 | 880ms | 36
Baidu: ERNIE 4.5 VL 424B A47B | $0.42 | $1.25 | 1411ms | 38
Z.ai: GLM 4.6 (exacto) | $0.44 | $1.76 | 834ms | 119
Meta: Llama 3 70B Instruct | $0.51 | $0.74 | 2071ms | 1
Xiaomi: MiMo-V2.5-Pro | $0.52 | $1.04 | 2509ms | 33
Z.ai: GLM 4.7 | $0.54 | $1.98 | 2585ms | 32
MiniMax: MiniMax M1 | $0.55 | $2.20 | 1265ms | 68
Z.ai: GLM 4.6 | $0.55 | $2.20 | 2238ms | 43
MoonshotAI: Kimi K2.5 | $0.57 | $2.85 | 3839ms | 28
MoonshotAI: Kimi K2 0711 | $0.57 | $2.30 | 1527ms | 13
Z.ai: GLM 4.5V | $0.60 | $1.80 | 2370ms | 37
MoonshotAI: Kimi K2 0905 | $0.60 | $2.50 | 1553ms | 9
Qwen: Qwen3.5 397B A17B | $0.60 | $3.60 | 1391ms | 43
MoonshotAI: Kimi K2 Thinking | $0.60 | $2.50 | 1163ms | 66
WizardLM-2 8x22B | $0.62 | $0.62 | 976ms | 11
DeepSeek: R1 0528 | $0.70 | $2.50 | N/A | N/A
DeepSeek: R1 | $0.70 | $2.50 | 2011ms | 27
DeepSeek: R1 Distill Llama 70B | $0.80 | $0.80 | 857ms | 23
MoonshotAI: Kimi K2.6 | $0.80 | $3.40 | 1930ms | 49
MoonshotAI: Kimi K2.7 Code | $0.95 | $4.00 | 3117ms | 31
Qwen: Qwen3 VL 235B A22B Thinking | $0.98 | $3.95 | 1293ms | 35
Z.ai: GLM 5 | $1.00 | $3.20 | 3566ms | 22
Z.ai: GLM 5.1 | $1.38 | $4.40 | 2160ms | 35
Z.ai: GLM 5.2 | $1.40 | $4.40 | 1963ms | 26
Sao10K: Llama 3.1 Euryale 70B v2.2 | $1.48 | $1.48 | 500ms | 49
Sao10k: Llama 3 Euryale 70B v2.1 | $1.48 | $1.48 | N/A | N/A
DeepSeek: DeepSeek V4 Pro | $1.60 | $3.20 | 1235ms | 55
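
To turn the listed prices into a per-request cost, a minimal sketch follows. It assumes "$/M" means US dollars per one million tokens, with input and output tokens billed separately; the page does not spell this out, and the function name `request_cost_usd` and the token counts are made up for illustration.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost_usd(input_price_per_m: str, output_price_per_m: str,
                     input_tokens: int, output_tokens: int) -> Decimal:
    """Cost of one request, given prices in USD per million tokens.

    Prices are passed as strings so Decimal keeps them exact.
    """
    input_cost = Decimal(input_price_per_m) * input_tokens
    output_cost = Decimal(output_price_per_m) * output_tokens
    return (input_cost + output_cost) / TOKENS_PER_MILLION


# Prices copied from the table above; 2,000 input and 500 output tokens are
# arbitrary example sizes.
gpt_oss_cost = request_cost_usd("0.05", "0.25", input_tokens=2000, output_tokens=500)
deepseek_v4_pro_cost = request_cost_usd("1.60", "3.20", input_tokens=2000, output_tokens=500)

print(f"OpenAI: gpt-oss-120b       ${gpt_oss_cost}")
print(f"DeepSeek: DeepSeek V4 Pro  ${deepseek_v4_pro_cost}")
```

`Decimal` is used instead of floats so that small per-request amounts do not pick up rounding noise when summed over many requests.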

Community Reviews

4.5 ★★★★★ (2 reviews)
clouduser42
★★★★★ 2025-06-15

Reliable service, great API documentation.

mlresearcher
★★★★ 2025-06-10

Good performance but support could be faster.