Novita
AGGREGATED · INFERENCE
Uptime: N/A
Rating: N/A
30-Day Uptime
100% (2026-05-22 to 2026-06-20)
Inference Latency
| Model | TTFT | TPS |
|---|---|---|
| inclusionAI: Ring-2.6-1T (free) | 2682ms | 57 |
| inclusionAI: Ling-2.6-1T (free) | 2597ms | 25 |
| inclusionAI: Ling-2.6-flash (free) | 3047ms | 19 |
| inclusionAI: Ling-2.6-flash | 858ms | 87 |
| Meta: Llama 3.1 8B Instruct | 681ms | 41 |
| OpenAI: gpt-oss-20b | 730ms | 135 |
| Mistral: Mistral Nemo | 856ms | 19 |
| OpenAI: gpt-oss-120b (exacto) | 633ms | 51 |
| OpenAI: gpt-oss-120b | 434ms | 173 |
| NVIDIA: Nemotron 3 Nano 30B A3B | 1016ms | 88 |
| Sao10K: Llama 3 8B Lunaris | 289ms | 71 |
| Qwen: Qwen3 Coder 30B A3B Instruct | 1195ms | 93 |
| Z.ai: GLM 4.7 Flash | 1476ms | 36 |
| inclusionAI: Ring-2.6-1T | 2173ms | 77 |
| inclusionAI: Ling-2.6-1T | 2335ms | 37 |
| Qwen: Qwen3 VL 8B Instruct | 980ms | 36 |
| Qwen: Qwen3 235B A22B Instruct 2507 | 1303ms | 29 |
| Google: Gemma 3 27B | 25605ms | 5 |
| Google: Gemma 4 26B A4B | 1138ms | 25 |
| Z.ai: GLM 4.5 Air | 1067ms | 37 |
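
The two latency figures can be combined into a rough end-to-end estimate for a single response. The sketch below assumes TTFT means time to first token, TPS means output tokens per second, and that the rate stays constant after the first token; the function name `estimated_response_seconds` and the 500-token response length are illustrative, not part of the listing.

```python
def estimated_response_seconds(ttft_ms: float, tps: float, output_tokens: int) -> float:
    """Approximate wall-clock time: time to first token plus steady-rate generation.

    Assumes TPS is a constant output rate once the first token has arrived.
    """
    if tps <= 0:
        raise ValueError("tps must be positive")
    return ttft_ms / 1000.0 + output_tokens / tps


# Figures taken from the latency listing; 500 output tokens is a hypothetical length.
fast = estimated_response_seconds(ttft_ms=434, tps=173, output_tokens=500)    # OpenAI: gpt-oss-120b
slow = estimated_response_seconds(ttft_ms=25605, tps=5, output_tokens=500)    # Google: Gemma 3 27B

print(f"gpt-oss-120b: {fast:.1f}s, Gemma 3 27B: {slow:.1f}s")
```

Under these assumptions, throughput dominates for long responses and TTFT dominates for short ones, so neither column alone ranks the models for a given workload.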
Inference Models
| Model | Input $/M | Output $/M | TTFT | TPS |
|---|---|---|---|---|
| inclusionAI: Ring-2.6-1T (free) | $0.00 | $0.00 | 2682ms | 57 |
| inclusionAI: Ling-2.6-1T (free) | $0.00 | $0.00 | 2597ms | 25 |
| inclusionAI: Ling-2.6-flash (free) | $0.00 | $0.00 | 3047ms | 19 |
| inclusionAI: Ling-2.6-flash | $0.01 | $0.03 | 858ms | 87 |
| Meta: Llama 3.1 8B Instruct | $0.02 | $0.05 | 681ms | 41 |
| OpenAI: gpt-oss-20b | $0.04 | $0.15 | 730ms | 135 |
| Mistral: Mistral Nemo | $0.04 | $0.17 | 856ms | 19 |
| OpenAI: gpt-oss-120b (exacto) | $0.04 | $0.20 | 633ms | 51 |
| OpenAI: gpt-oss-120b | $0.05 | $0.25 | 434ms | 173 |
| NVIDIA: Nemotron 3 Nano 30B A3B | $0.05 | $0.20 | 1016ms | 88 |
| Sao10K: Llama 3 8B Lunaris | $0.05 | $0.05 | 289ms | 71 |
| Baidu: ERNIE 4.5 21B A3B Thinking | $0.07 | $0.28 | — | — |
| Baidu: ERNIE 4.5 21B A3B | $0.07 | $0.28 | — | — |
| Qwen: Qwen3 Coder 30B A3B Instruct | $0.07 | $0.27 | 1195ms | 93 |
| Z.ai: GLM 4.7 Flash | $0.07 | $0.40 | 1476ms | 36 |
| inclusionAI: Ring-2.6-1T | $0.08 | $0.63 | 2173ms | 77 |
| inclusionAI: Ling-2.6-1T | $0.08 | $0.63 | 2335ms | 37 |
| Qwen: Qwen3 VL 8B Instruct | $0.08 | $0.50 | 980ms | 36 |
| Qwen: Qwen3 235B A22B Instruct 2507 | $0.09 | $0.58 | 1303ms | 29 |
| Google: Gemma 3 27B | $0.12 | $0.20 | 25605ms | 5 |
| Google: Gemma 4 26B A4B | $0.13 | $0.40 | 1138ms | 25 |
| Z.ai: GLM 4.5 Air | $0.13 | $0.85 | 1067ms | 37 |
| Meta: Llama 3.3 70B Instruct | $0.14 | $0.40 | — | — |
| Google: Gemma 4 31B | $0.14 | $0.40 | 1459ms | 16 |
| Baidu: ERNIE 4.5 VL 28B A3B | $0.14 | $0.56 | — | — |
| DeepSeek: DeepSeek V4 Flash | $0.14 | $0.28 | 1031ms | 33 |
| NousResearch: Hermes 2 Pro - Llama-3 8B | $0.14 | $0.14 | — | — |
| Qwen: Qwen3 Next 80B A3B Thinking | $0.15 | $1.50 | 1111ms | 225 |
| Qwen: Qwen3 Next 80B A3B Instruct | $0.15 | $1.50 | 1036ms | 82 |
| Meta: Llama 4 Scout | $0.18 | $0.59 | 420ms | 43 |
| Qwen: Qwen3 Coder Next | $0.20 | $1.50 | 2750ms | 20 |
| StepFun: Step 3.7 Flash | $0.20 | $1.15 | — | — |
| Qwen: Qwen3 VL 30B A3B Instruct | $0.20 | $0.70 | 2025ms | 23 |
| Qwen: Qwen3 VL 30B A3B Thinking | $0.20 | $1.00 | 4011ms | 82 |
| Kwaipilot: KAT-Coder-Pro V1 | $0.21 | $0.83 | 1788ms | 56 |
| DeepSeek: DeepSeek V3.1 Terminus (exacto) | $0.22 | $0.80 | 2354ms | 26 |
| DeepSeek: DeepSeek V3.2 | $0.27 | $0.40 | 1506ms | 31 |
| DeepSeek: DeepSeek V3.1 | $0.27 | $1.00 | 1709ms | 9 |
| MiniMax: MiniMax M2.7 | $0.27 | $1.08 | 1341ms | 45 |
| Meta: Llama 4 Maverick | $0.27 | $0.85 | 553ms | 26 |
| DeepSeek: DeepSeek V3 0324 | $0.27 | $1.12 | 1166ms | 26 |
| DeepSeek: DeepSeek V3.2 Exp | $0.27 | $0.41 | 1385ms | 19 |
| DeepSeek: DeepSeek V3.1 Terminus | $0.27 | $1.00 | 1672ms | 24 |
| Baidu: ERNIE 4.5 300B A47B | $0.28 | $1.10 | 1629ms | 21 |
| Z.ai: GLM 4.6V | $0.30 | $0.90 | 1650ms | 42 |
| MiniMax: MiniMax M2 | $0.30 | $1.20 | 1371ms | 71 |
| Qwen: Qwen3 VL 235B A22B Instruct | $0.30 | $1.50 | 2251ms | 18 |
| MiniMax: MiniMax M3 | $0.30 | $1.20 | 1468ms | 45 |
| MiniMax: MiniMax M2.5 | $0.30 | $1.20 | — | — |
| Qwen: Qwen3.5-27B | $0.30 | $2.40 | — | — |
| MiniMax: MiniMax M2.1 | $0.30 | $1.20 | 1459ms | 43 |
| Qwen: Qwen3 235B A22B Thinking 2507 | $0.30 | $3.00 | 476ms | 49 |
| Qwen2.5 72B Instruct | $0.38 | $0.40 | 36831ms | 3 |
| Qwen: Qwen3 Coder 480B A35B | $0.38 | $1.55 | 1389ms | 7 |
| DeepSeek: DeepSeek V3 | $0.40 | $1.30 | 1251ms | 24 |
| Qwen: Qwen3.5-122B-A10B | $0.40 | $3.20 | 880ms | 36 |
| Baidu: ERNIE 4.5 VL 424B A47B | $0.42 | $1.25 | 1411ms | 38 |
| Z.ai: GLM 4.6 (exacto) | $0.44 | $1.76 | 834ms | 119 |
| Meta: Llama 3 70B Instruct | $0.51 | $0.74 | 2071ms | 1 |
| Xiaomi: MiMo-V2.5-Pro | $0.52 | $1.04 | 2509ms | 33 |
| Z.ai: GLM 4.7 | $0.54 | $1.98 | 2585ms | 32 |
| MiniMax: MiniMax M1 | $0.55 | $2.20 | 1265ms | 68 |
| Z.ai: GLM 4.6 | $0.55 | $2.20 | 2238ms | 43 |
| MoonshotAI: Kimi K2.5 | $0.57 | $2.85 | 3839ms | 28 |
| MoonshotAI: Kimi K2 0711 | $0.57 | $2.30 | 1527ms | 13 |
| Z.ai: GLM 4.5V | $0.60 | $1.80 | 2370ms | 37 |
| MoonshotAI: Kimi K2 0905 | $0.60 | $2.50 | 1553ms | 9 |
| Qwen: Qwen3.5 397B A17B | $0.60 | $3.60 | 1391ms | 43 |
| MoonshotAI: Kimi K2 Thinking | $0.60 | $2.50 | 1163ms | 66 |
| WizardLM-2 8x22B | $0.62 | $0.62 | 976ms | 11 |
| DeepSeek: R1 0528 | $0.70 | $2.50 | — | — |
| DeepSeek: R1 | $0.70 | $2.50 | 2011ms | 27 |
| DeepSeek: R1 Distill Llama 70B | $0.80 | $0.80 | 857ms | 23 |
| MoonshotAI: Kimi K2.6 | $0.80 | $3.40 | 1930ms | 49 |
| MoonshotAI: Kimi K2.7 Code | $0.95 | $4.00 | 3117ms | 31 |
| Qwen: Qwen3 VL 235B A22B Thinking | $0.98 | $3.95 | 1293ms | 35 |
| Z.ai: GLM 5 | $1.00 | $3.20 | 3566ms | 22 |
| Z.ai: GLM 5.1 | $1.38 | $4.40 | 2160ms | 35 |
| Z.ai: GLM 5.2 | $1.40 | $4.40 | 1963ms | 26 |
| Sao10K: Llama 3.1 Euryale 70B v2.2 | $1.48 | $1.48 | 500ms | 49 |
| Sao10K: Llama 3 Euryale 70B v2.1 | $1.48 | $1.48 | — | — |
| DeepSeek: DeepSeek V4 Pro | $1.60 | $3.20 | 1235ms | 55 |
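
As a rough guide to reading the price columns, the sketch below computes the cost of one request. It assumes "$/M" means US dollars per million tokens and that input and output tokens are billed separately at the listed rates; the function name `request_cost_usd` and the token counts are illustrative.

```python
def request_cost_usd(
    input_price_per_m: float,
    output_price_per_m: float,
    input_tokens: int,
    output_tokens: int,
) -> float:
    """Cost of one request, given prices in USD per million tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# OpenAI: gpt-oss-120b is listed at $0.05 input and $0.25 output per million tokens.
# The 10,000 input / 2,000 output token counts are hypothetical.
cost = request_cost_usd(0.05, 0.25, input_tokens=10_000, output_tokens=2_000)
print(f"${cost:.4f}")
```

Because output prices in the table are usually several times the input price, the output token count tends to drive the total for generation-heavy workloads.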
Community Reviews
4.5 ★★★★★ (2 reviews)
clouduser42
★★★★★ 2025-06-15
Reliable service, great API documentation.
mlresearcher
★★★★☆ 2025-06-10
Good performance but support could be faster.