FriendliAI Pricing

Friendli Dedicated Endpoints

Build and run generative AI models on autopilot

Basic

Featured highlights

Get $10 in free credits when you sign up

Configurable autoscaling

Test your endpoints in the playground

Billed monthly

Enterprise

Featured highlights

Advanced features

Priority access to high-demand GPUs, including A100s and H100s

Monitor endpoints with Metrics & Logs

Dedicated support

Custom pricing

Pricing details

Endpoint

GPU Type                 $ / hour
Friendli on A100 80GB    $3.80
Friendli on H100 80GB    $7.60
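
As a rough illustration of how the hourly rates above turn into a bill, the sketch below multiplies a listed rate by the number of hours an endpoint runs. The function name `endpoint_cost` and the dictionary keys are hypothetical, and the calculation assumes a single GPU billed for the whole running time; autoscaling behavior and billing granularity are not described on this page.

```python
from decimal import Decimal

# Hourly rates copied from the Endpoint pricing table (USD per hour).
# The dictionary keys are illustrative labels, not official identifiers.
HOURLY_RATE_USD = {
    "A100 80GB": Decimal("3.8"),
    "H100 80GB": Decimal("7.6"),
}


def endpoint_cost(gpu_type: str, hours: float) -> Decimal:
    """Estimate the cost in USD of running one GPU for `hours` hours.

    Assumption: a single GPU is billed for the full duration.
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    # Convert via str() so a float such as 0.5 does not carry binary noise.
    return HOURLY_RATE_USD[gpu_type] * Decimal(str(hours))
```

Under these assumptions, `endpoint_cost("A100 80GB", 24)` comes to $91.20 for a full day, and the same day on an H100 80GB costs twice that.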

Fine-tuning

Model                                  $ / 1M tokens
Models up to 16B parameters            $0.50
Models from 16.1B to 72B parameters    $3.00

* We charge based on the total number of tokens processed by your fine-tuning jobs.
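
The fine-tuning charge can be estimated the same way: pick the tier from the model size, then scale the per-million rate by the tokens processed. This is a minimal sketch, not an official calculator. The name `fine_tuning_cost` is hypothetical, and treating every model above 16B (up to 72B) as the second tier is an assumption, since the page lists that tier as starting at 16.1B. Models above 72B are not listed, so the sketch rejects them.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # rates are quoted per 1M tokens


def fine_tuning_cost(model_params_b: float, tokens_processed: int) -> Decimal:
    """Estimate the fine-tuning cost in USD.

    model_params_b: model size in billions of parameters.
    tokens_processed: total tokens processed by the fine-tuning jobs.
    """
    size = Decimal(str(model_params_b))
    if size <= 0 or tokens_processed < 0:
        raise ValueError("model size must be positive and tokens non-negative")
    if size <= 16:
        rate = Decimal("0.50")  # models up to 16B parameters
    elif size <= 72:
        rate = Decimal("3.00")  # assumption: everything above 16B, up to 72B
    else:
        raise ValueError("no listed price for models above 72B parameters")
    return rate * Decimal(tokens_processed) / TOKENS_PER_UNIT
```

For example, processing 10 million tokens on an 8B model would come to $5.00 under these assumptions, while 2 million tokens on a 70B model would come to $6.00.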

Friendli Container

Serve LLM and LMM inference with Friendli Engine in your own GPU environment

Trial

Enjoy a 60-day free trial of Friendli Container to serve your LLM in your development environment.

Enterprise

Featured highlights

Contact us to use Friendli Container in your production environment.

Friendli Serverless Endpoints

Call our fast and affordable API for open-source generative AI models

Free trial

Sign up and get $5 in free trial credits!

Basic

Featured highlights

Run inference on models in the chat application

Pricing details

Model code                    Price per unit
Llama 3.1 8B Instruct         $0.10 / 1M tokens
Llama 3.1 70B Instruct        $0.60 / 1M tokens
Mixtral 8x7B Instruct v0.1    $0.40 / 1M tokens
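
To make the per-token prices concrete, here is a small sketch that converts a token count into a cost and a budget into a token count. The function names are hypothetical, the dictionary keys are simply the model names as listed here (not confirmed API identifiers), and the sketch assumes the rate applies uniformly to all tokens, since the page does not distinguish input from output tokens.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens

# Rates copied from the Serverless Endpoints pricing table (USD per 1M tokens).
PRICE_PER_1M_USD = {
    "Llama 3.1 8B Instruct": Decimal("0.1"),
    "Llama 3.1 70B Instruct": Decimal("0.6"),
    "Mixtral 8x7B Instruct v0.1": Decimal("0.4"),
}


def serverless_cost(model: str, tokens: int) -> Decimal:
    """Estimate the cost in USD of processing `tokens` tokens."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return PRICE_PER_1M_USD[model] * Decimal(tokens) / TOKENS_PER_UNIT


def tokens_for_budget(model: str, budget_usd: float) -> int:
    """Return how many whole tokens a budget in USD would cover."""
    if budget_usd < 0:
        raise ValueError("budget must be non-negative")
    budget = Decimal(str(budget_usd))
    return int(budget / PRICE_PER_1M_USD[model] * TOKENS_PER_UNIT)
```

Under these assumptions, 2.5 million tokens on Llama 3.1 70B Instruct would cost $1.50, and a $5 budget would cover 50 million tokens on Llama 3.1 8B Instruct.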