Friendli Model APIs apply rate limits based on your usage tier. Higher tiers unlock higher request rates and output length limits.

Tier-based API rate limits

Tiers are based on lifetime spending and update automatically: as your lifetime spend grows, your tier increases. To move up instantly, purchase additional credits.
Adaptive Rate Limits: Rate limits are applied dynamically based on overall platform conditions.
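Because limits can change with platform conditions, clients generally benefit from retrying rate-limited requests rather than failing outright. The following is a minimal sketch of exponential backoff with jitter, not an official client. `RateLimitError`, `call_with_backoff`, and `send_request` are hypothetical names introduced for illustration. The sketch assumes your own request function raises `RateLimitError` when the API rejects a call for exceeding the limit (commonly signalled by HTTP status 429, which is an assumption here rather than documented behavior).

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised by your own request code when a call is rate limited."""


def call_with_backoff(send_request, max_retries=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call send_request(), retrying with exponential backoff and full jitter.

    send_request is any zero-argument callable that returns a result or
    raises RateLimitError. The sleep parameter exists so the wait can be
    replaced in tests.
    """
    for attempt in range(max_retries + 1):
        try:
            return send_request()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error to the caller
            # The delay cap doubles on each attempt, bounded by max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random time in [0, cap] to spread out retries.
            sleep(random.uniform(0, cap))
```

Randomizing the wait keeps many clients from retrying at the same moment, which matters when the limit itself shifts with overall load.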
‘Output Token Length’ is the maximum number of tokens the model can generate in a single response. It differs from ‘Context Length’, which is the sum of the input and output tokens.
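Since context length covers input and output together, the room left for a response depends on how long the prompt is. As a rough sketch of that arithmetic, the helper below computes the largest output a request could ask for. The function name `max_output_tokens` and the numbers in the usage lines are illustrative assumptions, not actual limits for any tier or model.

```python
def max_output_tokens(context_length, input_tokens, output_token_limit):
    """Return the largest output size that fits both limits.

    context_length:      total tokens allowed for input plus output
    input_tokens:        tokens already used by the prompt
    output_token_limit:  the 'Output Token Length' cap for the tier
    """
    if input_tokens > context_length:
        raise ValueError("input alone exceeds the context length")
    # Output must fit in what the prompt leaves free, and under the tier cap.
    return min(output_token_limit, context_length - input_tokens)


# Illustrative numbers only.
short_prompt = max_output_tokens(8192, 1000, 4096)  # the tier cap is the binding limit
long_prompt = max_output_tokens(8192, 6000, 4096)   # the remaining context is the binding limit
```

With a short prompt the output cap decides the result; with a long prompt the remaining context does.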
Last modified on June 22, 2026