Pricing

Friendli Serverless Endpoints

Fast and affordable API for open-source models

Compare plans and features

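| Category | Feature | Starter | Enterprise |
|---|---|---|---|
| Inference | OpenAI compatible APIs | | |
| Inference | Optimized inference APIs | | |
| Inference | Long context (128K) handling | | |
| Inference | Function calling & JSON mode | | |
| Inference | Rate limit | Varies with tier (see tier details) | Custom |
| Tools | Document parsing | | |
| Tools | Web search | | |
| Tools | Code interpreter | | |
| Tools | Other built-in tools | | |
| Tiers (see tier details) | Minimum usage tier | Tier 1+ | Custom |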

Pricing details

Important Update

Effective June 20, we've introduced new billing options and plan changes. Each model is now billed either by token or by time, depending on the model.

Token-based Billing

Pinned models (such as DeepSeek, Llama, and other popular models) are charged on a per-token basis. The charge is based on the number of tokens processed, where a "token" is an individual unit of text processed by the model.

Time-based Billing

Models other than the pinned models are charged by time. The charge is based on the duration of compute time used for inference.

Free

| Model name | Free until |
|---|---|
| Midm-2.0-Base-Instruct | September 4, 2025 |

Token-based billing

| Model name | $ / 1M tokens |
|---|---|
| EXAONE-4.0.1-32B | Input: $0.6, Output: $1 |
| Llama-3.3-70B-Instruct | $0.6 |
| Llama-3.1-8B-Instruct | $0.1 |
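
As a rough illustration of how the per-token prices listed above turn into a bill, here is a minimal Python sketch. The function name `token_cost` and the `PRICES` table are hypothetical helpers written for this example, not part of any Friendli SDK. It also assumes that a model listed with a single price charges that price for both input and output tokens, which the page does not state explicitly.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens

# (input price, output price) in USD per 1M tokens, copied from the list above.
# ASSUMPTION: models shown with one price use it for input and output alike.
PRICES = {
    "EXAONE-4.0.1-32B": (Decimal("0.6"), Decimal("1")),
    "Llama-3.3-70B-Instruct": (Decimal("0.6"), Decimal("0.6")),
    "Llama-3.1-8B-Instruct": (Decimal("0.1"), Decimal("0.1")),
}


def token_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload under token-based billing."""
    input_price, output_price = PRICES[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / TOKENS_PER_UNIT


# 2M input tokens and 500K output tokens on EXAONE-4.0.1-32B:
# 2 * $0.6 + 0.5 * $1 = $1.70
print(token_cost("EXAONE-4.0.1-32B", 2_000_000, 500_000))
```

`Decimal` is used instead of floats so that small per-token amounts add up without binary rounding error. Actual invoices may differ from this estimate.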

Time-based billing

| Model name | $ / second |
|---|---|
| A.X-4.0 | $0.002 |
| A.X-3.1 | $0.002 |
| HyperCLOVAX-SEED-Think-14B | $0.002 |
| DeepSeek-R1-0528 | $0.004 |
| Llama-4-Maverick-17B-128E-Instruct | $0.004 |
| Llama-4-Scout-17B-16E-Instruct | $0.002 |
| Qwen3-235B-A22B-Thinking-2507 | $0.004 |
| Qwen3-235B-A22B-Instruct-2507 | $0.004 |
| Qwen3-30B-A3B | $0.002 |
| Qwen3-32B | $0.002 |
| gemma-3-27b-it | $0.002 |
| Mistral-Small-3.1-24B-Instruct-2503 | $0.002 |
| Devstral-Small-2505 | $0.002 |
| Magistral-Small-2506 | $0.002 |
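
To get a feel for the per-second prices listed above, the following Python sketch multiplies a rate by the compute time used. The names `time_cost` and `PRICE_PER_SECOND` are hypothetical and exist only for this example. The sketch assumes fractional seconds are billed proportionally. The page does not say how compute time is measured or rounded, so treat the result as an estimate.

```python
from decimal import Decimal

# USD per second of inference compute, copied from the list above
# (only a few models shown; the others follow the same pattern).
PRICE_PER_SECOND = {
    "DeepSeek-R1-0528": Decimal("0.004"),
    "Qwen3-32B": Decimal("0.002"),
    "gemma-3-27b-it": Decimal("0.002"),
}


def time_cost(model: str, seconds) -> Decimal:
    """Estimate the USD cost of `seconds` of compute under time-based billing.

    ASSUMPTION: cost scales linearly with time, with no rounding or minimum.
    """
    # str() keeps float inputs such as 1.5 from picking up binary noise.
    return PRICE_PER_SECOND[model] * Decimal(str(seconds))


# 90 seconds of compute on DeepSeek-R1-0528: 90 * $0.004 = $0.36
print(time_cost("DeepSeek-R1-0528", 90))
```

By the same arithmetic, one hour (3,600 seconds) of continuous compute comes to $7.20 at $0.002 per second and $14.40 at $0.004 per second.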

Contact us:

contact@friendli.ai

FriendliAI Corp:

3 E 3rd Ave #302,
San Mateo, CA 94401

Hub:

5F AMC Tower, 222 Bongeunsa-ro,
Gangnam-gu, Seoul, 06135, Korea

Copyright © 2025 FriendliAI Inc. All rights reserved