cpral

Nex-N2-Pro-EXL3-2BPW

Deploy Dedicated

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

2BPW EXL3 quant of Nex-N2-Pro 397B.

markdown
-- A perplexity:  3.27149317
 -- B perplexity:  3.62310984
 -- A label in top-K:
      K = 1: 0.7133
      K = 2: 0.8114
      K = 3: 0.8532
      K = 4: 0.8767
      K = 5: 0.8924
 -- B label in top-K:
      K = 1: 0.6874
      K = 2: 0.7958
      K = 3: 0.8399
      K = 4: 0.8663
      K = 5: 0.8840
 -- Top-K agreement, A vs B:
      K = 1: 0.8709
      K = 2: 0.5901
      K = 3: 0.3424
      K = 4: 0.1836
      K = 5: 0.0948
 -- KL divergence (A, B):  0.22048885
 -- KL divergence (B, A):  0.16824008