Pricing
Friendli Dedicated Endpoints
Build and run generative AI models on autopilot
Basic
Featured highlights
Get $10 in free credits upon sign up
Build and run generative AI models on autopilot
Configurable autoscaling
Test your endpoints in the playground
Billed monthly
Enterprise
Featured highlights
Advanced features
Priority access to high-demand GPUs, including A100s and H100s
Monitor endpoints with Metrics & Logs
Dedicated support
Custom pricing
Pricing details
Endpoint
GPU Type                 $ / hour
Friendli on A100 80GB    $3.8
Friendli on H100 80GB    $7.6
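To make the hourly rates concrete, here is a minimal sketch that estimates the cost of keeping one endpoint running for a given number of hours. The function name `endpoint_cost` and the dictionary keys are hypothetical, chosen for illustration. The sketch also assumes cost scales linearly with running time, with no minimum charge or rounding, which the page does not state.

```python
from decimal import Decimal

# Hourly rates copied from the pricing details above (USD per hour).
HOURLY_RATE_USD = {
    "A100 80GB": Decimal("3.8"),
    "H100 80GB": Decimal("7.6"),
}


def endpoint_cost(gpu_type: str, hours: float) -> Decimal:
    """Estimate the cost of running one endpoint for `hours` hours.

    Assumption: cost is simply rate x hours, with no minimum charge
    or rounding of partial hours. The pricing page does not confirm this.
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return HOURLY_RATE_USD[gpu_type] * Decimal(str(hours))


if __name__ == "__main__":
    print(endpoint_cost("A100 80GB", 24))
    print(endpoint_cost("H100 80GB", 24))
```

Under these assumptions, a full day on an A100 80GB endpoint comes to $91.20 and a full day on an H100 80GB endpoint to $182.40.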
Fine-tuning
Model                          $ / 1M tokens
Models up to 16B parameters    $0.50
Models 16.1B - 72B             $3.00
* We charge based on the total number of tokens processed by your fine-tuning jobs.
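As a rough illustration of the token-based fine-tuning charge, the sketch below picks a rate from the model size and multiplies it by the number of tokens processed. The helper names `fine_tuning_rate` and `fine_tuning_cost` are hypothetical. The sketch assumes that any model larger than 16B and up to 72B parameters falls in the higher tier, and that models above 72B have no listed price.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def fine_tuning_rate(params_billion: float) -> Decimal:
    """Return the USD price per 1M tokens for a model of the given size.

    Assumption: sizes between 16B and 16.1B are treated as the higher tier.
    """
    if params_billion <= 0:
        raise ValueError("params_billion must be positive")
    if params_billion <= 16:
        return Decimal("0.50")
    if params_billion <= 72:
        return Decimal("3.00")
    raise ValueError("no listed price for models above 72B parameters")


def fine_tuning_cost(params_billion: float, tokens_processed: int) -> Decimal:
    """Estimate the charge for the total tokens processed by a job."""
    if tokens_processed < 0:
        raise ValueError("tokens_processed must be non-negative")
    rate = fine_tuning_rate(params_billion)
    return rate * Decimal(tokens_processed) / TOKENS_PER_UNIT


if __name__ == "__main__":
    print(fine_tuning_cost(8, 20_000_000))
    print(fine_tuning_cost(70, 20_000_000))
```

With these assumptions, processing 20M tokens costs $10 for an 8B model and $60 for a 70B model. If a job makes several passes over a dataset, the total tokens processed is presumably the dataset size multiplied by the number of passes, but the page does not spell this out.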
Friendli Container
Serve LLM/LMM inference with Friendli Engine in your own GPU environment
Trial
Enjoy a 60-day free trial of Friendli Container to serve your LLM in your development environment.
Enterprise
Featured highlights
Contact us to use Friendli Container in your production environment.
Friendli Serverless Endpoints
Call a fast and affordable API for open-source generative AI models
Free trial
Sign up and get $5 in free trial credits!
Basic
Featured highlights
Run model inference in the chat application
Pricing details
Model code                     Price per unit
Mixtral-8x7B-Instruct-v0.1     $0.4/1M tokens
Meta-Llama-3.1-8B-Instruct     $0.1/1M tokens
Meta-Llama-3.1-70B-Instruct    $0.6/1M tokens
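The per-token prices above translate into a bill as in the following sketch. The helpers `serverless_cost` and `tokens_for_budget` are hypothetical names. The sketch assumes a single rate applies to all tokens (the page does not distinguish input from output tokens) and that trial credits are spent at the listed rates.

```python
from decimal import Decimal

# Prices copied from the pricing details above (USD per 1M tokens).
PRICE_PER_1M_TOKENS_USD = {
    "Mixtral-8x7B-Instruct-v0.1": Decimal("0.4"),
    "Meta-Llama-3.1-8B-Instruct": Decimal("0.1"),
    "Meta-Llama-3.1-70B-Instruct": Decimal("0.6"),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def serverless_cost(model_code: str, tokens: int) -> Decimal:
    """Estimate the cost of processing `tokens` tokens with a model.

    Assumption: one flat rate covers every token in a request.
    """
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return PRICE_PER_1M_TOKENS_USD[model_code] * Decimal(tokens) / TOKENS_PER_UNIT


def tokens_for_budget(model_code: str, budget_usd: float) -> int:
    """Return how many whole tokens a budget covers at the listed rate."""
    if budget_usd < 0:
        raise ValueError("budget_usd must be non-negative")
    units = Decimal(str(budget_usd)) / PRICE_PER_1M_TOKENS_USD[model_code]
    return int(units * TOKENS_PER_UNIT)


if __name__ == "__main__":
    print(serverless_cost("Meta-Llama-3.1-70B-Instruct", 2_500_000))
    print(tokens_for_budget("Meta-Llama-3.1-8B-Instruct", 5))
```

Under these assumptions, 2.5M tokens on Meta-Llama-3.1-70B-Instruct cost $1.50, and $5 of credit covers about 50M tokens on Meta-Llama-3.1-8B-Instruct.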