README

Nemotron Super 49B AIE v11 5500 Merged BF16

This is a bf16 export of nvidia/Llama-3_3-Nemotron-Super-49B-v1_5 with the local AIE v11 LoRA adapter merged into the base weights, so it loads as a standalone model without a separate adapter.
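As a rough illustration of what merging a LoRA adapter into base weights means, the sketch below applies the standard LoRA update W_merged = W + (alpha / r) * B A to a toy 2x2 matrix in plain Python. The rank, alpha, and matrix values are illustrative assumptions; this README does not state the adapter's rank or alpha, and the actual merge was done by LLaMA-Factory, not by this code.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a) as a new matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # shape: out_features x in_features
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy projection with a rank-1 adapter; every number here is illustrative.
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[0.5, -0.5]]    # rank x in_features
lora_b = [[1.0], [2.0]]   # out_features x rank
merged = merge_lora(base, lora_a, lora_b, alpha=2, rank=1)
print(merged)  # [[2.0, -1.0], [2.0, -1.0]]
```

After the merge, the adapter matrices are no longer needed at inference time, which is why the export ships as ordinary model weights.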

Training summary:

  • Dataset: aie_v11, 5,227 examples.
  • Stage 1: 4-bit LoRA SFT at cutoff_len=4096, 2 epochs.
  • Stage 2: continued from the stage 1 adapter at cutoff_len=5500, 2 epochs.
  • Stage 2 learning rate: 3e-5.
  • Stage 2 final train loss: 0.1407.
  • LoRA target modules: q_proj, v_proj.
  • Export dtype: bfloat16.
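The two stages above could be written down as LLaMA-Factory-style training arguments roughly as follows. This is a sketch, not the author's actual configuration: the key names follow LLaMA-Factory's commonly documented arguments, the adapter path is hypothetical, and values the summary does not give (the stage 1 learning rate, LoRA rank, chat template, and whether stage 2 also used 4-bit quantization) are left out.

```python
import json

BASE_MODEL = "nvidia/Llama-3_3-Nemotron-Super-49B-v1_5"

# Settings shared by both stages, taken from the training summary.
common = {
    "model_name_or_path": BASE_MODEL,
    "stage": "sft",
    "finetuning_type": "lora",
    "lora_target": "q_proj,v_proj",
    "dataset": "aie_v11",
    "num_train_epochs": 2.0,
}

# Stage 1: 4-bit LoRA SFT with a 4096-token cutoff.
stage1 = {**common, "quantization_bit": 4, "cutoff_len": 4096}

# Stage 2: resume from the stage 1 adapter with a longer cutoff.
# Quantization is omitted because the summary does not state it for stage 2.
stage2 = {
    **common,
    "cutoff_len": 5500,
    "learning_rate": 3e-5,
    "adapter_name_or_path": "saves/aie_v11_stage1",  # hypothetical path
}

print(json.dumps(stage2, indent=2))
```

Raising cutoff_len only in the second stage lets the longer examples be seen untruncated after the adapter has already been trained on the shorter context.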

The merged model was exported with LLaMA-Factory and split into 21 safetensors shards.
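For a sense of scale, a back-of-the-envelope estimate of the export size is sketched below. It assumes a nominal 49 billion parameters (read off the "49B" in the model name, not an exact count) and 2 bytes per bfloat16 weight, so the result is approximate and the real shard sizes will differ somewhat.

```python
NOMINAL_PARAMS = 49e9   # assumption: taken from the "49B" in the model name
BYTES_PER_PARAM = 2     # bfloat16 stores each weight in 16 bits
NUM_SHARDS = 21         # shard count stated in this README

total_gb = NOMINAL_PARAMS * BYTES_PER_PARAM / 1e9
avg_shard_gb = total_gb / NUM_SHARDS

print(f"approx. total: {total_gb:.0f} GB")
print(f"approx. average shard: {avg_shard_gb:.1f} GB")
```

Under these assumptions the weights come to roughly 98 GB in total, or a little under 5 GB per shard on average.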

Model provider

SiddharthaChekuri

Model tree

Base

nvidia/Llama-3_3-Nemotron-Super-49B-v1_5

Fine-tuned

this model

Modalities

Input

Text

Output

Text

Supported Functionality

Model APIs

Dedicated Endpoints

Container
