Rate Limits

Friendli Model APIs apply rate limits based on your usage tier. Higher tiers unlock higher request rates and output length limits.

Tier-Based API Rate Limits

Tiers are based on lifetime spending and update automatically. As your lifetime spend grows, your tier increases. You can move up instantly by purchasing additional credits.

Adaptive Rate Limits: Rate limits are applied dynamically based on overall platform conditions.
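Because limits can shift with platform conditions, a client may want to retry rate-limited requests instead of failing outright. The following is a minimal sketch of exponential backoff with jitter, not an official client. `RateLimitError`, `call_with_backoff`, and the `send` callable are hypothetical names introduced for illustration; the sketch assumes your own request code raises `RateLimitError` when the API rejects a call for exceeding the limit (HTTP 429 is the conventional status code for this, but check the actual response you receive).

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error: raised by your own request code on a rate-limit response."""


def call_with_backoff(send, max_retries=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call send(); on RateLimitError, wait and retry with exponential backoff.

    send        -- zero-argument callable that performs one API request (assumed)
    max_retries -- number of retries after the first attempt
    base_delay  -- delay in seconds before the first retry; doubles on each retry
    max_delay   -- upper bound on the un-jittered delay
    sleep       -- injectable sleep function, so the logic can be tested without waiting
    """
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error to the caller
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Add up to 10% jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

Wrap a single request in a function and pass it as `send`. The default values here are illustrative and should be tuned to your tier's limits.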

‘Output Token Length’ is the maximum number of tokens the model can generate in a single response. It differs from ‘Context Length’, which is the combined total of input and output tokens.
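The relationship between the two limits can be worked through with a small calculation. This sketch assumes that a response is bounded both by the tier's output limit and by whatever room the input leaves in the context window. The function name and all numbers are hypothetical, not actual Friendli limits.

```python
def max_output_tokens(context_length, input_tokens, tier_output_limit):
    """Return the largest output a single request could produce.

    context_length    -- total tokens (input + output) the model accepts
    input_tokens      -- tokens already used by the prompt
    tier_output_limit -- output token cap for the usage tier (assumed value)
    """
    if input_tokens > context_length:
        raise ValueError("input alone exceeds the context length")
    remaining_context = context_length - input_tokens
    # The stricter of the two bounds applies.
    return min(remaining_context, tier_output_limit)


# Illustrative numbers only.
short_prompt = max_output_tokens(context_length=8192, input_tokens=1000, tier_output_limit=2048)
long_prompt = max_output_tokens(context_length=8192, input_tokens=7000, tier_output_limit=2048)
print(short_prompt)  # 2048: the tier's output limit is the binding constraint
print(long_prompt)   # 1192: the remaining context is the binding constraint
```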

Last modified on July 13, 2026
