Power AI at Scale on FriendliAI with NVIDIA Models

High-performance, production-ready NVIDIA AI models, ready to deploy on FriendliAI’s flexible inference platform.

NVIDIA Nemotron 3 Family

Nano

Nano provides cost efficiency with high accuracy for targeted agentic tasks.

Super

Super delivers high accuracy for multi-agentic reasoning.

Ultra

Ultra is designed for applications demanding the highest reasoning accuracy.

The new Nemotron 3 family provides the most efficient open models, built on a hybrid Mamba-Transformer mixture-of-experts (MoE) architecture with a 1M-token context window. The models deliver top accuracy for complex, high-throughput agentic AI applications.

BENEFITS

Why NVIDIA models on FriendliAI

Maximize throughput. Minimize cost. Accelerate innovation.

Fast inference

Optimized kernels and hardware utilization.

Accurate outputs

Leverage NVIDIA’s advanced model architectures.

Secure & reliable

Enterprise-grade compliance and uptime.

Scalable on demand

Handle bursty or steady workloads with predictable costs.

FriendliAI hosts a curated suite of NVIDIA-optimized AI models so you can reach production faster with industry-leading performance. Whether you’re building large-scale LLM applications, embedding-based search, or fine-tuned domain-specific models, our platform delivers the compute and tooling you need without the infrastructure complexity.

Access the model you want

Fully hosted Nemotron models on FriendliAI.

Try the NEW NVIDIA Nemotron 3 Super with FriendliAI