HIPAA Compliance, AICPA SOC 2® Type II

Contact us: contact@friendli.ai

FriendliAI Corp: San Francisco, CA

Hub: Seoul, Korea

Copyright © 2026 FriendliAI Corp. All rights reserved

570,875 Models Available

Featured models

All models

570,875 results found

Provider                    Model Name                                    Input  Output  Type
Hcompany                    Holo3-35B-A3B                                                Fine-tuned
NousResearch                Hermes-4.3-36B                                               Fine-tuned
bytedance-research          UI-TARS-7B-DPO                                               Base
meta-llama                  Meta-Llama-3-8B                                              Base
Qwen                        Qwen2.5-1.5B-Instruct                                        Fine-tuned
Qwen                        Qwen2.5-7B-Instruct                                          Fine-tuned
Qwen                        Qwen2.5-VL-7B-Instruct                                       Base
mistralai                   Mistral-7B-Instruct-v0.3                                     Fine-tuned
Hcompany                    Holo-3.1-0.8B                                                Fine-tuned
0xSero                      DeepSeek-V4-Flash-180B                                       Quantized
Qwen                        WebWorld-8B                                                  Fine-tuned
0xSero                      MiniMax-M2.1-REAP-25                                         Quantized
kpsss34                     FHDR_Uncensored                                              Quantized
black-forest-labs           FLUX.1-Kontext-dev                                           Base
Qwen                        Qwen2.5-3B-Instruct                                          Fine-tuned
google                      gemma-7b                                                     Base
Qwen                        Qwen2.5-Coder-32B-Instruct                                   Fine-tuned
0xSero                      DeepSeek-V4-Flash-162B                                       Quantized
nvidia                      Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4                 Quantized
ZJU-AI4H                    Hulu-Med-Flash-Preview-27B                                   Base
datalab-to                  chandra-ocr-2                                                Base
Qwen                        Qwen3.5-0.8B                                                 Fine-tuned
nvidia                      NVIDIA-Nemotron-3-Nano-30B-A3B-BF16                          Base
Qwen                        Qwen3-4B-Instruct-2507                                       Base
microsoft                   Phi-4-mini-instruct                                          Base
SupraLabs                   Supra-50M-Reasoning                                          Fine-tuned
openbmb                     SciCore-Omics                                                Base
Jackrong                    Qwopus3.6-27B-v2                                             Base
Qwen                        Qwen3.6-35B-A3B-FP8                                          Quantized
RohitUltimate               Qwen3.5_VL_2B_12k                                            Fine-tuned
google                      gemma-4-26B-A4B                                              Base
rednote-dots-ocr-community  dots.ocr-1.5                                                 Base
nvidia                      NVIDIA-Nemotron-3-Super-120B-A12B-BF16                       Base
ZJU-AI4H                    Hulu-Med-235A22                                              Base
Kbenkhaled                  Qwen3.5-27B-NVFP4                                            Quantized
ZJU-AI4H                    Hulu-Med-30A3                                                Base
Qwen                        Qwen3.5-122B-A10B                                            Base
Qwen                        Qwen3.5-397B-A17B                                            Base
google                      functiongemma-270m-it                                        Base
meta-llama                  Llama-3.2-3B                                                 Base
HuggingFaceTB               SmolLM2-135M-Instruct                                        Quantized
Qwen                        Qwen2.5-0.5B                                                 Base
