
HIPAA Compliance, AICPA SOC 2® Type II

Contact us: contact@friendli.ai

FriendliAI Corp: San Francisco, CA

Hub: Seoul, Korea


Copyright © 2026 FriendliAI Corp. All rights reserved


Open Models, Ready for Production

Run 581,188 Open Models on the Frontier Inference Cloud.


All models

7,984 results found

| Model Name | Type |
| --- | --- |
| Doomate/mark_5B-A3B | Base |
| amd/MiniMax-M3-MXFP4-AttnFP8 | Quantized |
| nwzjk/MiniMax-M3-MXFP4 | Quantized |
| Civitai/Qwen-Image-Bench-FP8 | Quantized |
| BOOOMJIAO/Qwable-v1 | Fine-tuned |
| Tok331102/Qwen3.6-35B-A3B | Base |
| GestaltLabs/Ornstein-3.5-9B-V2 | Fine-tuned |
| Zaynoid/Med-3.5-9B-EBOS-v2 | Fine-tuned |
| tawkeed-sa/tawkeed-gpt | Fine-tuned |
| Tooony133/Qwen-3.6-27B-v2 | Base |
| onda/ligature-seam-gemma4 | Adapter |
| LaniakeaPH/chandra-ocr-2 | Base |
| Firefly77/gemma-4-12B-it | Fine-tuned |
| hirundo-io/Qwen3.5-4B-restrictions-removed-lora | Adapter |
| alvarobartt/Qwen3.5-4B-FT | Fine-tuned |
| sahilchachra/Qwable-v1-NVFP4A16 | Quantized |
| LuciaValentine/origin_gemma4_12b_exllama3_2.0bpw | Quantized |
| LuciaValentine/origin_gemma4_12b_exllama3_4.0bpw | Quantized |
| PhYen/OCR-AVD | Fine-tuned |
| LuciaValentine/origin_gemma4_12b_exllama3_6.0bpw | Quantized |
| LuciaValentine/origin_gemma4_12b_exllama3_8.0bpw | Quantized |
| mattbucci/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-AWQ | Quantized |
| tamewild/4b_v225_merged_e5 | Base |
| azazeal2/Qwoble3.5_4B | Fine-tuned |
| EpistemeAI/Reasoning-Medical-27B | Fine-tuned |
| srv-sngh/gemma-4-12B-coder-fable5-composer2.5-nvfp4 | Quantized |
| efficiencyx/Jun-FP16-138s | Fine-tuned |
| quockhangdev/CoTuGRM-2.5.T1-DFT-5E5 | Base |
| azazeal2/Qwoble | Fine-tuned |
| quockhangdev/CoTuGRM-2.5.T1-SFT-2E4 | Base |
| quockhangdev/CoTuGRM-2.5.T1-DFT-2E4 | Base |
| usermma/Qwable-v1-mlx-8Bit | Quantized |
| usermma/Qwable-v1-mlx-5Bit | Quantized |
| usermma/Qwable-v1-mlx-6Bit | Quantized |
| usermma/Qwable-v1-mlx-4Bit | Quantized |
| usermma/Qwable-v1-mlx-3Bit | Quantized |
| usermma/Qwable-v1-mlx-2Bit | Quantized |
| usermma/Qwable-v1-mlx-fp16 | Fine-tuned |
| MagistrTheOne/SHUTEN-DOJI | Fine-tuned |
| amd/MiniMax-M3-MXFP4 | Quantized |
| usermma/Darwin-28B-Coder-mlx-8Bit | Quantized |
| usermma/Darwin-28B-Coder-mlx-6Bit | Quantized |
