Cheapest H100: $2.49/hr · Updated: 05:21 PM

Novita

AGGREGATED · INFERENCE
Uptime: N/A
Rating: N/A

30-Day Uptime

96.7%
2026-03-23 to 2026-04-21

Inference Latency

Meta: Llama 3.1 8B Instruct | 641ms TTFT · 31 TPS
Meta: Llama 3 8B Instruct | 500ms TTFT · 25 TPS
Mistral: Mistral Nemo | 1044ms TTFT · 22 TPS
OpenAI: gpt-oss-120b (exacto) | 633ms TTFT · 51 TPS
OpenAI: gpt-oss-20b | 676ms TTFT · 89 TPS
OpenAI: gpt-oss-120b | 625ms TTFT · 21 TPS
Sao10K: Llama 3 8B Lunaris | 3864ms TTFT · 23 TPS
Z.ai: GLM 4.7 Flash | 1159ms TTFT · 58 TPS
Baidu: ERNIE 4.5 21B A3B | 1473ms TTFT · 67 TPS
Qwen: Qwen3 Coder 30B A3B Instruct | 1282ms TTFT · 58 TPS
Qwen: Qwen3 VL 8B Instruct | 740ms TTFT · 44 TPS
Qwen: Qwen3 235B A22B Instruct 2507 | 1271ms TTFT · 19 TPS
Qwen: Qwen3 30B A3B | 921ms TTFT · 43 TPS
Qwen: Qwen3 32B | 477ms TTFT · 50 TPS
Xiaomi: MiMo-V2-Flash | 1772ms TTFT · 14 TPS
Google: Gemma 3 27B | 998ms TTFT · 28 TPS
Google: Gemma 4 26B A4B | 1063ms TTFT · 36 TPS
Z.ai: GLM 4.5 Air | 762ms TTFT · 50 TPS
Meta: Llama 3.3 70B Instruct | 684ms TTFT · 24 TPS
Google: Gemma 4 31B | 787ms TTFT · 5 TPS

Inference Models

Model | Input $/M tokens | Output $/M tokens | TTFT | TPS
Meta: Llama 3.1 8B Instruct | $0.02 | $0.05 | 641ms | 31
Meta: Llama 3 8B Instruct | $0.04 | $0.04 | 500ms | 25
Mistral: Mistral Nemo | $0.04 | $0.17 | 1044ms | 22
OpenAI: gpt-oss-120b (exacto) | $0.04 | $0.20 | 633ms | 51
OpenAI: gpt-oss-20b | $0.04 | $0.15 | 676ms | 89
OpenAI: gpt-oss-120b | $0.05 | $0.25 | 625ms | 21
Sao10K: Llama 3 8B Lunaris | $0.05 | $0.05 | 3864ms | 23
Baidu: ERNIE 4.5 21B A3B Thinking | $0.07 | $0.28 | N/A | N/A
Z.ai: GLM 4.7 Flash | $0.07 | $0.40 | 1159ms | 58
Baidu: ERNIE 4.5 21B A3B | $0.07 | $0.28 | 1473ms | 67
Qwen: Qwen3 Coder 30B A3B Instruct | $0.07 | $0.27 | 1282ms | 58
Qwen: Qwen3 VL 8B Instruct | $0.08 | $0.50 | 740ms | 44
Qwen: Qwen3 235B A22B Instruct 2507 | $0.09 | $0.58 | 1271ms | 19
Qwen: Qwen3 30B A3B | $0.09 | $0.45 | 921ms | 43
Qwen: Qwen3 32B | $0.10 | $0.45 | 477ms | 50
Xiaomi: MiMo-V2-Flash | $0.10 | $0.30 | 1772ms | 14
Google: Gemma 3 27B | $0.12 | $0.20 | 998ms | 28
Google: Gemma 4 26B A4B | $0.13 | $0.40 | 1063ms | 36
Z.ai: GLM 4.5 Air | $0.13 | $0.85 | 762ms | 50
Meta: Llama 3.3 70B Instruct | $0.14 | $0.40 | 684ms | 24
Baidu: ERNIE 4.5 VL 28B A3B | $0.14 | $0.56 | N/A | N/A
NousResearch: Hermes 2 Pro - Llama-3 8B | $0.14 | $0.14 | N/A | N/A
Google: Gemma 4 31B | $0.14 | $0.40 | 787ms | 5
Qwen: Qwen3 Next 80B A3B Thinking | $0.15 | $1.50 | 938ms | 157
Qwen: Qwen3 Next 80B A3B Instruct | $0.15 | $1.50 | 723ms | 3
Meta: Llama 4 Scout | $0.18 | $0.59 | 469ms | 34
Qwen: Qwen3 VL 30B A3B Thinking | $0.20 | $1.00 | 2930ms | 68
Qwen: Qwen3 Coder Next | $0.20 | $1.50 | 1048ms | 97
Qwen: Qwen3 VL 30B A3B Instruct | $0.20 | $0.70 | 2254ms | 24
Kwaipilot: KAT-Coder-Pro V1 | $0.21 | $0.83 | 1788ms | 56
DeepSeek: DeepSeek V3.1 Terminus (exacto) | $0.22 | $0.80 | 2354ms | 26
DeepSeek: DeepSeek V3.2 | $0.27 | $0.40 | 1523ms | 24
Meta: Llama 4 Maverick | $0.27 | $0.85 | 598ms | 18
DeepSeek: DeepSeek V3 0324 | $0.27 | $1.12 | 1195ms | 27
DeepSeek: DeepSeek V3.1 | $0.27 | $1.00 | 1882ms | 22
DeepSeek: DeepSeek V3.2 Exp | $0.27 | $0.41 | 1302ms | 13
DeepSeek: DeepSeek V3.1 Terminus | $0.27 | $1.00 | 1589ms | 30
Baidu: ERNIE 4.5 300B A47B | $0.28 | $1.10 | 3963ms | 27
Z.ai: GLM 4.6V | $0.30 | $0.90 | 1328ms | 20
Qwen: Qwen3 VL 235B A22B Instruct | $0.30 | $1.50 | 3434ms | 10
MiniMax: MiniMax M2.1 | $0.30 | $1.20 | 2205ms | 27
MiniMax: MiniMax M2.5 | $0.30 | $1.20 | 3208ms | 29
Qwen: Qwen3 Coder 480B A35B | $0.30 | $1.30 | 931ms | 2
MiniMax: MiniMax M2 | $0.30 | $1.20 | N/A | N/A
Qwen: Qwen3.5-27B | $0.30 | $2.40 | 708ms | 8
Qwen: Qwen3 235B A22B Thinking 2507 | $0.30 | $3.00 | 726ms | 29
Qwen2.5 72B Instruct | $0.38 | $0.40 | N/A | N/A
Qwen: Qwen3.5-122B-A10B | $0.40 | $3.20 | 694ms | 12
DeepSeek: DeepSeek V3 | $0.40 | $1.30 | 1303ms | 21
Baidu: ERNIE 4.5 VL 424B A47B | $0.42 | $1.25 | 1099ms | 9
MiniMax: MiniMax M1 | $0.44 | $1.76 | 3343ms | 9
Z.ai: GLM 4.6 (exacto) | $0.44 | $1.76 | 834ms | 119
Meta: Llama 3 70B Instruct | $0.51 | $0.74 | 679ms | 16
Z.ai: GLM 4.7 | $0.54 | $1.98 | 1452ms | 35
Z.ai: GLM 4.6 | $0.55 | $2.20 | 1544ms | 34
MoonshotAI: Kimi K2.5 | $0.57 | $2.85 | 6675ms | 22
MoonshotAI: Kimi K2 0711 | $0.57 | $2.30 | 1069ms | 15
Z.ai: GLM 4.5V | $0.60 | $1.80 | 1248ms | 52
Qwen: Qwen3.5 397B A17B | $0.60 | $3.60 | 1277ms | 51
Z.ai: GLM 4.5 | $0.60 | $2.20 | 709ms | 43
MoonshotAI: Kimi K2 Thinking | $0.60 | $2.50 | 1400ms | 18
MoonshotAI: Kimi K2 0905 | $0.60 | $2.50 | 2134ms | 13
WizardLM-2 8x22B | $0.62 | $0.62 | 1208ms | 8
DeepSeek: R1 | $0.70 | $2.50 | 1604ms | 31
DeepSeek: R1 0528 | $0.70 | $2.50 | N/A | N/A
DeepSeek: R1 Distill Llama 70B | $0.80 | $0.80 | 1315ms | 35
Qwen: Qwen2.5 VL 72B Instruct | $0.80 | $0.80 | 2369ms | 20
MoonshotAI: Kimi K2.6 | $0.95 | $4.00 | 3058ms | 28
Qwen: Qwen3 VL 235B A22B Thinking | $0.98 | $3.95 | N/A | N/A
Z.ai: GLM 5 | $1.00 | $3.20 | 1754ms | 29
Z.ai: GLM 5.1 | $1.40 | $4.40 | 1976ms | 31
Sao10k: Llama 3 Euryale 70B v2.1 | $1.48 | $1.48 | N/A | N/A
Sao10K: Llama 3.1 Euryale 70B v2.2 | $1.48 | $1.48 | 618ms | 36
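
As a rough illustration of how the columns in the table above combine, the sketch below estimates the cost and completion time of a single request. It rests on assumptions the page does not state: that "$/M" means US dollars per one million tokens, that TPS is output tokens per second once generation has started, and that total latency is roughly TTFT plus output tokens divided by TPS. The function names and the token counts are hypothetical; the prices and timings are copied from the Meta: Llama 3.1 8B Instruct row.

```python
def request_cost_usd(input_tokens, output_tokens, input_per_m, output_per_m):
    """Estimated cost in USD, assuming prices are quoted per one million tokens."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


def request_latency_s(output_tokens, ttft_ms, tps):
    """Estimated seconds to completion: time to first token plus steady decoding."""
    return ttft_ms / 1000 + output_tokens / tps


# Row "Meta: Llama 3.1 8B Instruct": $0.02 input, $0.05 output, 641ms TTFT, 31 TPS.
# The token counts (2,000 in, 500 out) are made up for illustration.
cost = request_cost_usd(2_000, 500, input_per_m=0.02, output_per_m=0.05)
latency = request_latency_s(500, ttft_ms=641, tps=31)
print(f"cost: ${cost:.6f}, latency: {latency:.1f}s")
```

Under these assumptions, a low TPS figure dominates the time for long outputs, while TTFT matters most for short ones.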

Community Reviews

4.5 ★★★★★ (2 reviews)
clouduser42
★★★★★ 2025-06-15

Reliable service, great API documentation.

mlresearcher
★★★★ 2025-06-10

Good performance but support could be faster.