README
Nemotron Super 49B AIE v11 5500 Merged BF16
This is a merged bf16 export of nvidia/Llama-3_3-Nemotron-Super-49B-v1_5 with the local AIE v11 LoRA adapter merged into the base weights.
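To make "merged into the base weights" concrete, here is a minimal sketch of the standard LoRA merge rule, W' = W + (alpha / r) * B A, on a toy matrix. The rank, alpha, and all numeric values below are made up for illustration; this card does not state the adapter's rank or alpha, and the actual merge was done by the export tooling, not by this code.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) without modifying the inputs."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # shape: (out_features, in_features)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 base weight with a rank-1 adapter (hypothetical values).
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]    # shape: (rank, in_features)
B = [[0.5], [0.0]]  # shape: (out_features, rank)

merged = merge_lora(W, A, B, alpha=2, rank=1)
print(merged)  # [[2.0, 2.0], [0.0, 1.0]]
```

After merging, the adapter matrices are no longer needed at inference time, which is why this export loads as an ordinary full-weight checkpoint.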
Training summary:
- Dataset: aie_v11, 5,227 examples.
- Stage 1: 4-bit LoRA SFT at cutoff_len=4096, 2 epochs.
- Stage 2: continued from the stage 1 adapter at cutoff_len=5500, 2 epochs.
- Stage 2 learning rate: 3e-5.
- Stage 2 final train loss: 0.1407.
- LoRA target modules: q_proj,v_proj.
- Export dtype: bfloat16.
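The two-stage schedule above could be written down as a pair of configuration dictionaries. This is only a sketch: the key names follow LLaMA-Factory's training argument names as I understand them, the output paths are hypothetical, and values the summary does not state (stage 1 learning rate, chat template, batch size, LoRA rank, whether stage 2 was also 4-bit) are deliberately left out.

```python
import json

BASE_MODEL = "nvidia/Llama-3_3-Nemotron-Super-49B-v1_5"

# Settings shared by both stages, taken from the training summary.
common = {
    "model_name_or_path": BASE_MODEL,
    "stage": "sft",
    "finetuning_type": "lora",
    "lora_target": "q_proj,v_proj",
    "dataset": "aie_v11",
    "num_train_epochs": 2,
}

# Stage 1: 4-bit LoRA SFT at the shorter context length.
stage1 = {
    **common,
    "quantization_bit": 4,
    "cutoff_len": 4096,
    "output_dir": "saves/stage1",  # hypothetical path
}

# Stage 2: resume from the stage 1 adapter with a longer cutoff.
stage2 = {
    **common,
    "cutoff_len": 5500,
    "learning_rate": 3e-5,
    "adapter_name_or_path": stage1["output_dir"],
    "output_dir": "saves/stage2",  # hypothetical path
}

print(json.dumps(stage2, indent=2))
```

A real run would need the omitted fields filled in before either dictionary could be passed to a trainer.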
The merged model was exported with LLaMA-Factory and split into 21 safetensors shards.
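A sharded safetensors checkpoint in the Hugging Face layout normally ships with a `model.safetensors.index.json` file whose `weight_map` maps each tensor name to the shard that holds it. Assuming this export follows that layout (the card does not confirm the file names), the following sketch lists the shards the index refers to and reports any that are missing from a local download. The function names are illustrative, not part of any library.

```python
import json
from pathlib import Path

INDEX_NAME = "model.safetensors.index.json"  # assumed standard index file name


def list_shards(model_dir):
    """Return the sorted shard file names referenced by the safetensors index."""
    index_path = Path(model_dir) / INDEX_NAME
    with index_path.open(encoding="utf-8") as f:
        index = json.load(f)
    return sorted(set(index["weight_map"].values()))


def missing_shards(model_dir):
    """Return shard names listed in the index but absent from model_dir."""
    root = Path(model_dir)
    return [name for name in list_shards(model_dir) if not (root / name).is_file()]
```

On a complete local copy of this export, `len(list_shards(path))` would be expected to equal 21 and `missing_shards(path)` to be empty.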
Model provider: SiddharthaChekuri
Model tree:
- Base: nvidia/Llama-3_3-Nemotron-Super-49B-v1_5
- Fine-tuned: this model
Modalities:
- Input: Text
- Output: Text