README
Nemotron Reasoning LoRA Adapter
LoRA adapter for nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 or a compatible Nemotron-3-Nano-30B-A3B base model.
Expected files:
adapter_config.json
adapter_model.safetensors
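A quick way to confirm that a downloaded adapter directory is complete is to check for the two files listed above. The following is a minimal sketch using only the Python standard library; the helper name `missing_adapter_files` is hypothetical and not part of this repository.

```python
from pathlib import Path

# File names taken from the "Expected files" list above.
EXPECTED_FILES = ("adapter_config.json", "adapter_model.safetensors")


def missing_adapter_files(adapter_dir):
    """Return the expected file names that are absent from adapter_dir."""
    root = Path(adapter_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_adapter_files("/workspace/nemotron-reasoning-lora-adapter")` returns an empty list when both files are present.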
RunPod download example:
bash
hf download PhuQuy23TNT1/nemotron-reasoning-lora-adapter \
  --local-dir /workspace/nemotron-reasoning-lora-adapter
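The same download can likely be done from Python with `snapshot_download` from the `huggingface_hub` package. This is a sketch, assuming `huggingface_hub` is installed and the repository is publicly accessible; the wrapper name `download_adapter` is hypothetical.

```python
def download_adapter(local_dir="/workspace/nemotron-reasoning-lora-adapter"):
    """Download the adapter repository into local_dir and return the local path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="PhuQuy23TNT1/nemotron-reasoning-lora-adapter",
        local_dir=local_dir,
    )
```

The default `local_dir` mirrors the path used in the shell command above; pass a different directory when not running on RunPod.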
Rollout example:
bash
python offline/sample_rollouts.py \
  --model_path unsloth/Nemotron-3-Nano-30B-A3B \
  --adapter_path /workspace/nemotron-reasoning-lora-adapter \
  --mode probe \
  --output /workspace/rollouts.jsonl
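To inspect the rollout output afterwards, a generic reader may be enough. The sketch below assumes `/workspace/rollouts.jsonl` is a standard JSON Lines file (one JSON value per line), as the extension suggests; the record schema is not documented here, so no field names are assumed. The helper name `load_rollouts` is hypothetical.

```python
import json


def load_rollouts(path):
    """Parse a JSON Lines file into a list of records, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```

For example, `len(load_rollouts("/workspace/rollouts.jsonl"))` gives the number of records written by the script.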
Model provider
PhuQuy23TNT1
Model tree
Base
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
Adapter
this model
Modalities
Input
Text
Output
Text
Pricing
Dedicated Endpoints
Supported Functionality
Model APIs
Dedicated Endpoints
Container