README

License: apache-2.0

A custom-optimized EXL3 quantization of Nex-N2-Pro 397B at roughly 3.4 bits per weight (BPW).
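As a rough guide to what ~3.4 BPW means for storage, the sketch below estimates the size of the quantized weights. It assumes "397B" means 397 billion parameters and that every parameter is stored at the average bit rate. The real checkpoint will differ somewhat, since some tensors may be kept at a different precision and the files carry metadata. The helper name `estimate_weight_bytes` is hypothetical and not part of any library.

```python
def estimate_weight_bytes(num_params: float, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone: params * bits / 8."""
    return num_params * bits_per_weight / 8


# Assumption: "397B" is read as 397 billion parameters.
NUM_PARAMS = 397e9
# Approximate average bit rate quoted in the README.
BITS_PER_WEIGHT = 3.4

size_bytes = estimate_weight_bytes(NUM_PARAMS, BITS_PER_WEIGHT)
print(f"approx. {size_bytes / 1e9:.1f} GB ({size_bytes / 2**30:.1f} GiB)")
```

The estimate covers weights only. Memory for the KV cache and activations at inference time comes on top of it.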

-- A perplexity: 3.27273603
-- B perplexity: 3.29885435
-- A label in top-K:
K = 1: 0.7132
K = 2: 0.8113
K = 3: 0.8533
K = 4: 0.8765
K = 5: 0.8924
-- B label in top-K:
K = 1: 0.7110
K = 2: 0.8106
K = 3: 0.8527
K = 4: 0.8758
K = 5: 0.8921
-- Top-K agreement, A vs B:
K = 1: 0.9508
K = 2: 0.8034
K = 3: 0.6177
K = 4: 0.4464
K = 5: 0.3076
-- KL divergence (A, B): 0.02929534
-- KL divergence (B, A): 0.02823736
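For readers unfamiliar with these metrics, the following is a minimal sketch of how figures of this kind can be computed from two models' next-token logits. It is not the script that produced the numbers above. The function names and toy data are hypothetical, and it assumes that "top-K agreement" means both models produce the same ordered top-K token list, which is one plausible reading of the output.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k):
    """Indices of the k most probable tokens, most probable first."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def compare(logits_a, logits_b, labels, k=1):
    """Average the per-position metrics over all positions."""
    n = len(labels)
    nll_a = nll_b = kl_ab = kl_ba = 0.0
    hits_a = hits_b = agree = 0
    for la, lb, y in zip(logits_a, logits_b, labels):
        pa, pb = softmax(la), softmax(lb)
        nll_a -= math.log(pa[y])   # negative log-likelihood of the true token
        nll_b -= math.log(pb[y])
        ta, tb = top_k(pa, k), top_k(pb, k)
        hits_a += y in ta          # "label in top-K"
        hits_b += y in tb
        agree += ta == tb          # assumed: identical ordered top-K lists
        kl_ab += kl_divergence(pa, pb)
        kl_ba += kl_divergence(pb, pa)
    return {
        "perplexity_a": math.exp(nll_a / n),
        "perplexity_b": math.exp(nll_b / n),
        "label_in_topk_a": hits_a / n,
        "label_in_topk_b": hits_b / n,
        "topk_agreement": agree / n,
        "kl_ab": kl_ab / n,
        "kl_ba": kl_ba / n,
    }


# Toy data: two positions, vocabulary of three tokens (illustrative only).
logits_a = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
logits_b = [[1.9, 1.1, 0.1], [0.4, 2.4, 0.5]]
labels = [0, 1]

for name, value in compare(logits_a, logits_b, labels, k=1).items():
    print(f"{name}: {value:.4f}")
```

Under this reading, agreement is expected to fall as K grows, because every one of the K ranked positions has to match exactly. KL divergence is asymmetric, which is why the output reports both (A, B) and (B, A).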

Model provider

cpral

Model tree

Base

nex-agi/Nex-N2-Pro

Quantized

this model

Modalities

Input

Video, Text, Image

Output

Text

Pricing

Dedicated Endpoints

Supported Functionality

Model APIs

Dedicated Endpoints

Container