• May 14, 2025
  • 3 min read

Explore 370K+ AI Models on FriendliAI's Models Page


We’re excited to launch our new models page, a central place where you can explore and use all the AI models available on our platform. This update makes it easier than ever to find the right model for your needs, whether you’re working on text, audio, video, image, or multimodal tasks, and to deploy it with a single click.

Model page

Figure 1: FriendliAI’s model page. Reference: FriendliAI. [Online] Available: https://friendli.ai/models [Accessed May 13, 2025]

A Diverse Collection of Generative AI Models

FriendliAI offers broad model support, including LoRA adapters as well as fine-tuned, merged, and quantized models. Here’s a breakdown of the models available as of May 13, 2025:

  • Language Models: 360,452 models
  • Audio Understanding: 9,979 models
  • Image Understanding: 4,447 models
  • Image Generation: 29,618 models
  • Video Understanding: 875 models

Note: These categories represent major model types but may overlap, as many models are multimodal. The total model count exceeds 370,000 and includes models that fall outside these classifications.

You can also deploy your custom models directly from Hugging Face with just one click. Check out our previous blog posts, “Deploy Models from Hugging Face to Friendli Endpoints” and “Deploy Multimodal Models from Hugging Face to FriendliAI with Ease” to learn more.
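Once deployed, endpoints are typically queried over an OpenAI-style chat completions API. The sketch below only builds the request body; the base URL and model identifier are assumptions for illustration, so check the FriendliAI documentation for the exact values to use with your endpoint.

```python
# Sketch: shape of an OpenAI-compatible chat completion request to a
# deployed endpoint. BASE_URL and MODEL are illustrative assumptions.
import json

BASE_URL = "https://api.friendli.ai/serverless/v1"  # assumed base URL; verify in the docs
MODEL = "meta-llama-3.1-8b-instruct"  # hypothetical model identifier

def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    """Build the JSON body for an OpenAI-style chat completion request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

payload = build_chat_request("Summarize the new models page in one sentence.")
print(json.dumps(payload, indent=2))
```

You would POST this body to the endpoint's chat completions route with your API key in the `Authorization` header.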

We are constantly adding new models to our supported list. If you don't see the specific model you need, you can contact us at support@friendli.ai and we’ll add it as soon as we can.

Seamless Integration with Friendli Endpoints

One of the standout features of the new models page is the ability to view the list of endpoints for each model. This lets you quickly identify and deploy the models you want, streamlining the process of shipping AI solutions.

Endpoint list

Figure 2: Model overview shown upon clicking a model. This view contains (1) detailed information about the model; (2) a list of the logged-in user’s Dedicated Endpoints deployed with that model; and (3) a list of its readily available Serverless Endpoints. Reference: FriendliAI. [Online] Available: https://friendli.ai/models [Accessed May 13, 2025]

Friendli Optimized Models

We’ve gone above and beyond to optimize some of the most popular models, delivering exceptional performance and quality through our cutting-edge inference acceleration technology. Experience the next level of performance with FriendliAI.

You can filter by “Optimized” to see these accelerated models, such as:

Optimized models

Figure 3: Popular models optimized by FriendliAI. Reference: FriendliAI. [Online] Available: https://friendli.ai/models/search?optimized=true [Accessed May 13, 2025]

| Company   | Model                           |
|-----------|---------------------------------|
| DeepSeek  | DeepSeek-R1-Distill-Qwen-1.5B   |
| DeepSeek  | DeepSeek-R1-Distill-Qwen-32B    |
| DeepSeek  | DeepSeek-R1-Distill-Llama-8B    |
| DeepSeek  | DeepSeek-R1-Distill-Qwen-7B     |
| DeepSeek  | DeepSeek-R1-Distill-Llama-70B   |
| DeepSeek  | DeepSeek-R1-Distill-Qwen-14B    |
| Meta      | Llama-3.3-70B-Instruct          |
| Meta      | Llama-3.2-3B-Instruct           |
| Meta      | Llama-3.1-8B-Instruct           |
| Meta      | Llama-3.1-70B-Instruct          |
| Microsoft | phi-4                           |
| Microsoft | Phi-3.5-mini-instruct           |
| Mistral   | Mistral-7B-Instruct-v0.3        |
| NVIDIA    | Llama-3.1-Nemotron-70B-Instruct |
| Qwen      | Qwen2.5-Coder-32B-Instruct      |
| Qwen      | Qwen2.5-72B-Instruct            |
| Qwen      | QwQ-32B-Preview                 |

Figure 4: Links to Model Deploy page on Friendli Suite for FriendliAI’s optimized models. Reference: FriendliAI. [Online] Available: https://friendli.ai/models/search?optimized=true [Accessed May 13, 2025]

FriendliAI is at the forefront of fast, efficient, and reliable generative AI inference-time scaling. Recognized as the world’s fastest GPU-based API provider by Artificial Analysis, FriendliAI enables businesses to deploy and serve custom AI models with minimal latency and reduced cost.

By leveraging FriendliAI's technology, companies can harness the power of AI to drive innovation and growth. If you'd like to optimize and deploy your own models for superior performance, don't hesitate to reach out to us. We're here to help you take your services to new heights! Check out our previous blog post “Compress Generative AI Models with Friendli Model Optimizer” to learn more.

Start Exploring Today

Visit our models page to explore our ever-growing list of over 370,000 AI models and revolutionize your projects. Find the perfect model to elevate your work.

Stay tuned for more updates as we continue to expand and enhance our platform! Can't find the model you need? Contact us at support@friendli.ai.


Written by

FriendliAI Tech & Research




General FAQ

What is FriendliAI?

FriendliAI is a GPU-inference platform that lets you deploy, scale, and monitor large language and multimodal models in production, without owning or managing GPU infrastructure. We offer three things for your AI models: unmatched speed, cost efficiency, and operational simplicity. Find out which product is the best fit for you here.

How does FriendliAI help my business?

Friendli Inference lets you squeeze more tokens per second out of every GPU. Because you need fewer GPUs to serve the same load, the metric that matters, tokens per dollar, comes out higher even if the hourly GPU rate looks similar on paper. View pricing
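As a back-of-the-envelope illustration of the tokens-per-dollar framing (the throughput and price figures below are made up for the example, not measured numbers):

```python
def tokens_per_dollar(tokens_per_sec: float, gpu_hourly_usd: float) -> float:
    """Tokens served per dollar of GPU time, over one hour of serving."""
    return tokens_per_sec * 3600 / gpu_hourly_usd

# Two GPUs at the same hourly rate: the one pushing more tokens/sec
# wins on tokens per dollar, even though the rate looks identical.
baseline = tokens_per_dollar(tokens_per_sec=1_000, gpu_hourly_usd=4.0)   # 900,000 tokens/$
optimized = tokens_per_dollar(tokens_per_sec=2_500, gpu_hourly_usd=4.0)  # 2,250,000 tokens/$
```

The same reasoning also works in reverse: higher throughput per GPU means fewer GPUs for a fixed traffic load.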

Which models and modalities are supported?

Over 380,000 text, vision, audio, and multimodal models are deployable out of the box. You can also upload custom models or LoRA adapters. Explore models

Can I deploy models from Hugging Face directly?

Yes. Selecting “Friendli Endpoints” on the Hugging Face Hub takes you to our one-click model deployment page, which provides an easy-to-use interface for setting up Friendli Dedicated Endpoints, a managed service for generative AI inference. Learn more about our Hugging Face partnership

Still have questions?

If you want a customized solution for the key issue slowing your growth, email contact@friendli.ai or click Contact Sales; our experts (not a bot) will reply within one business day.