Power AI at Scale on FriendliAI with NVIDIA Models
High-performance, production-ready NVIDIA AI models, deployable on FriendliAI’s flexible inference platform.


Nemotron 3 Family
Nano
Nano provides cost efficiency with high accuracy for targeted agentic tasks.
Super
Super delivers high accuracy for multi-agentic reasoning.
Ultra
Ultra is designed for applications demanding the highest reasoning accuracy.
The new Nemotron 3 family provides the most efficient open models, built on a hybrid Mamba‑Transformer mixture-of-experts (MoE) architecture with a 1M-token context window, and delivers top accuracy for complex, high-throughput agentic AI applications.
BENEFITS
Why NVIDIA models on FriendliAI
Maximize throughput. Minimize cost. Accelerate innovation.
Fast inference
Optimized kernels and hardware utilization.
Accurate outputs
Leverage NVIDIA’s advanced model architectures.
Secure & reliable
Enterprise-grade compliance and uptime.
Scalable on demand
Handle bursty or steady workloads with predictable costs.
FriendliAI hosts a curated suite of NVIDIA-optimized AI models so you can reach production faster with industry-leading performance. Whether you’re building large-scale LLM applications, embedding-based search, or fine-tuned domain-specific models, our platform delivers the compute and tooling you need without infrastructure complexity.
Access the model you want
Fully hosted Nemotron models on FriendliAI.