Power AI at Scale on FriendliAI with NVIDIA Models

High-performance, production-ready NVIDIA AI models ready to deploy on FriendliAI’s highly optimized Frontier AI Inference Cloud.

Nemotron 3 Family

Nano

Nano provides cost efficiency with high accuracy for targeted agentic tasks.

Nano Omni

Nemotron 3 Nano Omni is an open mulitmodal reasoning model designed for production agentic AI.

Super

Super delivers high accuracy for multi-agentic reasoning.

Ultra

Ultra is designed for applications demanding the highest reasoning accuracy.

The new Nemotron 3 family provides the most efficient open models, powered by hybrid Mamba‑Transformer MoE with 1M-token context, delivering top accuracy for complex, high-throughput agentic AI applications.

BENEFITS

Why NVIDIA models on FriendliAI

Maximize throughput. Minimize cost. Accelerate innovation.

Fast inference

Optimized kernels and hardware utilization.

Accurate outputs

Leverage NVIDIA’s advanced model architectures.

Secure & reliable

Enterprise-grade compliance and uptime.

Scalable on demand

Burst or steady workloads with cost predictability.

FriendliAI hosts a curated suite of NVIDIA-optimized AI models so you can hit production faster with industry-leading performance. Whether you’re building large-scale LLM applications, embedding search, or fine-tuning domain-specific intelligence — our platform delivers the compute and tooling you need without infrastructure complexity.

Access the model you want

Fully hosted Nemotron models on FriendliAI.

Deploy NVIDIA Nemotron 3 Models on FriendliAI

Get started Talk to an engineer

Explore FriendliAI today

Get started Talk to an engineer