Models: —+ | Providers: —+ | Cheapest H100: $2.49/hr | Updated: 04:31 PM

Nebius

AGGREGATED · INFERENCE
Uptime: N/A
Rating: N/A

30-Day Uptime

96.7%
2026-05-22 to 2026-06-20

Inference Latency

Google: Gemma 2 9B                        266ms TTFT · 36 TPS
Qwen: Qwen2.5 Coder 7B Instruct           238ms TTFT · 96 TPS
NVIDIA: Nemotron 3 Nano 30B A3B           5710ms TTFT · 117 TPS
Qwen: Qwen3 32B                           382ms TTFT · 20 TPS
Google: Gemma 3 27B                       999ms TTFT · 15 TPS
Qwen: Qwen3 30B A3B Instruct 2507         417ms TTFT · 7 TPS
Meta: Llama 3.3 70B Instruct              549ms TTFT · 22 TPS
Nous: Hermes 4 70B                        276ms TTFT · 63 TPS
OpenAI: gpt-oss-120b                      230ms TTFT · 339 TPS
Qwen: Qwen3 Next 80B A3B Thinking         112ms TTFT · 113 TPS
Prime Intellect: INTELLECT-3              176ms TTFT · 59 TPS
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1  320ms TTFT · 38 TPS
NVIDIA: Nemotron 3 Ultra                  747ms TTFT · 69 TPS
Nous: Hermes 4 405B                       450ms TTFT · 30 TPS

Inference Models

Model                                     Input $/M  Output $/M  TTFT    TPS
Google: Gemma 2 9B                        $0.03      $0.09       266ms   36
Qwen: Qwen2.5 Coder 7B Instruct           $0.03      $0.09       238ms   96
NVIDIA: Nemotron 3 Nano 30B A3B           $0.06      $0.24       5710ms  117
Qwen: Qwen3 32B                           $0.10      $0.30       382ms   20
Google: Gemma 3 27B                       $0.10      $0.30       999ms   15
Qwen: Qwen3 30B A3B Instruct 2507         $0.10      $0.30       417ms   7
Meta: Llama 3.3 70B Instruct              $0.13      $0.40       549ms   22
Nous: Hermes 4 70B                        $0.13      $0.40       276ms   63
OpenAI: gpt-oss-120b                      $0.15      $0.60       230ms   339
Qwen: Qwen3 Next 80B A3B Thinking         $0.15      $1.20       112ms   113
Prime Intellect: INTELLECT-3              $0.20      $1.10       176ms   59
NVIDIA: Nemotron 3 Super                  $0.30      $0.90
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1  $0.60      $1.80       320ms   38
NVIDIA: Nemotron 3 Ultra                  $1.00      $3.00       747ms   69
Nous: Hermes 4 405B                       $1.00      $3.00       450ms   30
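
As a rough illustration of how the pricing and latency figures above combine, the sketch below estimates the cost and response time of a single request. It assumes that "$/M" means US dollars per million tokens, that TTFT is time to first token, and that TPS is output tokens per second. The `ModelQuote` class, both helper functions, and the latency approximation (TTFT plus output tokens divided by TPS) are hypothetical and not part of any NexusGPU or Nebius API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelQuote:
    """One row of the Inference Models table (hypothetical container)."""
    name: str
    input_usd_per_m: float   # assumed: USD per million input tokens
    output_usd_per_m: float  # assumed: USD per million output tokens
    ttft_ms: float           # time to first token, in milliseconds
    tps: float               # output tokens per second


def request_cost_usd(q: ModelQuote, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request under per-million-token pricing."""
    total = input_tokens * q.input_usd_per_m + output_tokens * q.output_usd_per_m
    return total / 1_000_000


def estimated_latency_s(q: ModelQuote, output_tokens: int) -> float:
    """Rough estimate: wait for the first token, then stream at a steady rate."""
    return q.ttft_ms / 1000 + output_tokens / q.tps


# Values copied from the gpt-oss-120b row of the table.
gpt_oss = ModelQuote("OpenAI: gpt-oss-120b", 0.15, 0.60, 230, 339)

cost = request_cost_usd(gpt_oss, input_tokens=2000, output_tokens=500)
latency = estimated_latency_s(gpt_oss, output_tokens=500)

print(f"cost: ${cost:.4f}")        # cost: $0.0006
print(f"latency: {latency:.2f}s")  # latency: 1.70s
```

The latency figure is only a first-order estimate; measured TTFT and TPS vary with load, prompt length, and region.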

Community Reviews

4.5 ★★★★★ (2 reviews)
clouduser42
★★★★★ 2025-06-15

Reliable service, great API documentation.

mlresearcher
★★★★ 2025-06-10

Good performance but support could be faster.

Nebius: Provider Scorecard | NexusGPU