License: apache-2.0

~3.55 BPW (bits per weight) custom optimized EXL3 quant of Nex-N2-Pro 397B.
-- A perplexity: 3.27204336
-- B perplexity: 3.28824294
-- A label in top-K:
   K = 1: 0.7131
   K = 2: 0.8113
   K = 3: 0.8530
   K = 4: 0.8768
   K = 5: 0.8925
-- B label in top-K:
   K = 1: 0.7117
   K = 2: 0.8106
   K = 3: 0.8526
   K = 4: 0.8761
   K = 5: 0.8922
-- Top-K agreement, A vs B:
   K = 1: 0.9551
   K = 2: 0.8190
   K = 3: 0.6442
   K = 4: 0.4761
   K = 5: 0.3361
-- KL divergence (A, B): 0.02391939
-- KL divergence (B, A): 0.02346712
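The comparison above reports perplexity, label-in-top-K accuracy, top-K agreement, and KL divergence for two models labelled A and B. The card does not say how these were computed, so the following is only a minimal sketch of one plausible set of definitions, in pure Python on toy data. All function names and inputs here are hypothetical. In particular, it assumes "top-K agreement" means both models produce the same ordered top-K token list, which would be consistent with the reported values falling as K grows; the actual evaluation script may define it differently.

```python
import math

def softmax(logits):
    """Convert one row of logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def perplexity(prob_rows, labels):
    """exp of the mean negative log-likelihood of the true tokens."""
    nll = -sum(math.log(p[y]) for p, y in zip(prob_rows, labels))
    return math.exp(nll / len(labels))

def top_k(probs, k):
    """Token indices of the k most probable tokens, most probable first."""
    return sorted(range(len(probs)), key=lambda i: -probs[i])[:k]

def label_in_top_k(prob_rows, labels, k):
    """Fraction of positions whose true token is among the top k."""
    hits = sum(y in top_k(p, k) for p, y in zip(prob_rows, labels))
    return hits / len(labels)

def top_k_agreement(rows_a, rows_b, k):
    """Fraction of positions where both ordered top-k lists are identical
    (assumed definition, see the note above)."""
    same = sum(top_k(a, k) == top_k(b, k) for a, b in zip(rows_a, rows_b))
    return same / len(rows_a)

def kl_divergence(rows_p, rows_q):
    """Mean KL(P || Q) over positions; asymmetric in its arguments."""
    total = 0.0
    for p, q in zip(rows_p, rows_q):
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return total / len(rows_p)

# Toy data: two positions, vocabulary of three tokens (made up for illustration).
labels = [0, 1]
a = [softmax(r) for r in [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]]
b = [softmax(r) for r in [[1.9, 1.1, 0.2], [0.4, 2.0, 0.9]]]

print("A perplexity:", perplexity(a, labels))
print("Top-1 agreement:", top_k_agreement(a, b, 1))
print("Top-2 agreement:", top_k_agreement(a, b, 2))
print("KL(A, B):", kl_divergence(a, b))
print("KL(B, A):", kl_divergence(b, a))
```

Because KL divergence is asymmetric, the two directions generally differ slightly, which matches the card reporting separate (A, B) and (B, A) values.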
Model provider: cpral

Model tree
Base: nex-agi/Nex-N2-Pro
Quantized: this model

Modalities
Input: Video, Text, Image
Output: Text