LIVE
Models: —+ Providers: —+ Cheapest H100: $2.49/hr Updated: 05:15 PM

Nebius

AGGREGATED · INFERENCE
Uptime: N/A
Rating: N/A

30-Day Uptime

96.7%
2026-03-23 to 2026-04-21

Inference Latency

Meta: Llama 3.1 8B Instruct - 341ms TTFT · 8 TPS
Qwen: Qwen2.5 Coder 7B Instruct - 238ms TTFT · 96 TPS
Google: Gemma 2 9B - 266ms TTFT · 36 TPS
Qwen: Qwen3 32B - 353ms TTFT · 8 TPS
Google: Gemma 3 27B - 527ms TTFT · 27 TPS
Qwen: Qwen3 30B A3B Instruct 2507 - 232ms TTFT · 40 TPS
Nous: Hermes 4 70B - 482ms TTFT · 59 TPS
Meta: Llama 3.3 70B Instruct - 3834ms TTFT · 12 TPS
Qwen: Qwen3 Next 80B A3B Thinking - 410ms TTFT · 105 TPS
OpenAI: gpt-oss-120b - 390ms TTFT · 68 TPS
Prime Intellect: INTELLECT-3 - 189ms TTFT · 128 TPS
Qwen: Qwen2.5 VL 72B Instruct - 818ms TTFT · 23 TPS
NVIDIA: Nemotron 3 Super - 1332ms TTFT · 49 TPS
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 - 320ms TTFT · 38 TPS
Nous: Hermes 4 405B - 454ms TTFT · 26 TPS
Z.ai: GLM 5 - 1368ms TTFT · 39 TPS
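
The TTFT and TPS figures above can be combined into a rough end-to-end response time. The sketch below assumes that TPS means output tokens per second once streaming has started and that throughput stays constant for the whole response; the page does not define either, so treat the result as a back-of-the-envelope estimate. The function name `estimated_response_seconds` and the 512-token response length are illustrative choices, not part of the listing.

```python
def estimated_response_seconds(ttft_ms: float, tps: float, output_tokens: int) -> float:
    """Rough wall-clock estimate: time to first token plus steady-state decoding.

    Assumes TPS is the output-token rate after the first token arrives.
    """
    if tps <= 0:
        raise ValueError("tps must be positive")
    return ttft_ms / 1000.0 + output_tokens / tps


# TTFT (ms) and TPS copied from the latency list; 512 output tokens is an
# arbitrary example length.
intellect_3 = estimated_response_seconds(189, 128, 512)
llama_3_1_8b = estimated_response_seconds(341, 8, 512)

print(f"INTELLECT-3: {intellect_3:.2f}s")
print(f"Llama 3.1 8B Instruct: {llama_3_1_8b:.2f}s")
```

Under these assumptions, throughput dominates for long responses: a low TTFT matters little when TPS is in the single digits.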

Inference Models

| Model | Input $/M | Output $/M | TTFT | TPS |
|---|---|---|---|---|
| Meta: Llama 3.1 8B Instruct | $0.02 | $0.06 | 341ms | 8 |
| Qwen: Qwen2.5 Coder 7B Instruct | $0.03 | $0.09 | 238ms | 96 |
| Google: Gemma 2 9B | $0.03 | $0.09 | 266ms | 36 |
| Qwen: Qwen3 32B | $0.10 | $0.30 | 353ms | 8 |
| Google: Gemma 3 27B | $0.10 | $0.30 | 527ms | 27 |
| Qwen: Qwen3 30B A3B Instruct 2507 | $0.10 | $0.30 | 232ms | 40 |
| Nous: Hermes 4 70B | $0.13 | $0.40 | 482ms | 59 |
| Meta: Llama 3.3 70B Instruct | $0.13 | $0.40 | 3834ms | 12 |
| Qwen: Qwen3 Next 80B A3B Thinking | $0.15 | $1.20 | 410ms | 105 |
| OpenAI: gpt-oss-120b | $0.15 | $0.60 | 390ms | 68 |
| Prime Intellect: INTELLECT-3 | $0.20 | $1.10 | 189ms | 128 |
| Qwen: Qwen2.5 VL 72B Instruct | $0.25 | $0.75 | 818ms | 23 |
| NVIDIA: Nemotron 3 Super | $0.30 | $0.90 | 1332ms | 49 |
| MiniMax: MiniMax M2.5 | $0.30 | $1.20 | | |
| NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 | $0.60 | $1.80 | 320ms | 38 |
| Nous: Hermes 4 405B | $1.00 | $3.00 | 454ms | 26 |
| Z.ai: GLM 5 | $1.00 | $3.20 | 1368ms | 39 |
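
To turn the listed prices into a per-request cost, the sketch below assumes that "$/M" means US dollars per million tokens and that input and output tokens are billed separately at the listed rates; the page does not spell this out. The function name `request_cost_usd` and the token counts are hypothetical, chosen only for illustration.

```python
def request_cost_usd(
    input_price_per_m: float,
    output_price_per_m: float,
    input_tokens: int,
    output_tokens: int,
) -> float:
    """Cost of one request, assuming prices are USD per million tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Prices copied from the table; 2,000 input and 500 output tokens are
# arbitrary example values.
gpt_oss_120b = request_cost_usd(0.15, 0.60, 2_000, 500)
qwen3_next_80b = request_cost_usd(0.15, 1.20, 2_000, 500)

print(f"gpt-oss-120b: ${gpt_oss_120b:.6f}")
print(f"Qwen3 Next 80B A3B Thinking: ${qwen3_next_80b:.6f}")
```

Two models with the same input price can differ noticeably in cost once output tokens are counted, which is why both columns matter for generation-heavy workloads.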

Community Reviews

4.5 ★★★★★ (2 reviews)
clouduser42
★★★★★ 2025-06-15

Reliable service, great API documentation.

mlresearcher
★★★★ 2025-06-10

Good performance but support could be faster.