HIPAA Compliance

AICPA SOC 2® Type II

Contact us:

contact@friendli.ai

FriendliAI Corp:

San Francisco, CA

Hub:

Seoul, Korea

Privacy Policy Service Level Agreement Terms of Service CA Notice

Copyright © 2026 FriendliAI Corp. All rights reserved

Open Models, Ready for Production

Run 581,188 Open Models on the Frontier Inference Cloud.

Featured models

All models

7,984 results found

| Model Name | Input | Output | Type |
| --- | --- | --- | --- |
| Doomate/mark_5B-A3B | | | Base |
| amd/MiniMax-M3-MXFP4-AttnFP8 | | | Quantized |
| nwzjk/MiniMax-M3-MXFP4 | | | Quantized |
| Civitai/Qwen-Image-Bench-FP8 | | | Quantized |
| BOOOMJIAO/Qwable-v1 | | | Fine-tuned |
| Tok331102/Qwen3.6-35B-A3B | | | Base |
| GestaltLabs/Ornstein-3.5-9B-V2 | | | Fine-tuned |
| Zaynoid/Med-3.5-9B-EBOS-v2 | | | Fine-tuned |
| tawkeed-sa/tawkeed-gpt | | | Fine-tuned |
| Tooony133/Qwen-3.6-27B-v2 | | | Base |
| onda/ligature-seam-gemma4 | | | Adapter |
| LaniakeaPH/chandra-ocr-2 | | | Base |
| Firefly77/gemma-4-12B-it | | | Fine-tuned |
| hirundo-io/Qwen3.5-4B-restrictions-removed-lora | | | Adapter |
| alvarobartt/Qwen3.5-4B-FT | | | Fine-tuned |
| sahilchachra/Qwable-v1-NVFP4A16 | | | Quantized |
| LuciaValentine/origin_gemma4_12b_exllama3_2.0bpw | | | Quantized |
| LuciaValentine/origin_gemma4_12b_exllama3_4.0bpw | | | Quantized |
| PhYen/OCR-AVD | | | Fine-tuned |
| LuciaValentine/origin_gemma4_12b_exllama3_6.0bpw | | | Quantized |
| LuciaValentine/origin_gemma4_12b_exllama3_8.0bpw | | | Quantized |
| mattbucci/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-AWQ | | | Quantized |
| tamewild/4b_v225_merged_e5 | | | Base |
| azazeal2/Qwoble3.5_4B | | | Fine-tuned |
| EpistemeAI/Reasoning-Medical-27B | | | Fine-tuned |
| srv-sngh/gemma-4-12B-coder-fable5-composer2.5-nvfp4 | | | Quantized |
| efficiencyx/Jun-FP16-138s | | | Fine-tuned |
| quockhangdev/CoTuGRM-2.5.T1-DFT-5E5 | | | Base |
| azazeal2/Qwoble | | | Fine-tuned |
| quockhangdev/CoTuGRM-2.5.T1-SFT-2E4 | | | Base |
| quockhangdev/CoTuGRM-2.5.T1-DFT-2E4 | | | Base |
| usermma/Qwable-v1-mlx-8Bit | | | Quantized |
| usermma/Qwable-v1-mlx-5Bit | | | Quantized |
| usermma/Qwable-v1-mlx-6Bit | | | Quantized |
| usermma/Qwable-v1-mlx-4Bit | | | Quantized |
| usermma/Qwable-v1-mlx-3Bit | | | Quantized |
| usermma/Qwable-v1-mlx-2Bit | | | Quantized |
| usermma/Qwable-v1-mlx-fp16 | | | Fine-tuned |
| MagistrTheOne/SHUTEN-DOJI | | | Fine-tuned |
| amd/MiniMax-M3-MXFP4 | | | Quantized |
| usermma/Darwin-28B-Coder-mlx-8Bit | | | Quantized |
| usermma/Darwin-28B-Coder-mlx-6Bit | | | Quantized |
