Friendli Dedicated Endpoints let you deploy custom or open-source AI models on GPU instances dedicated entirely to you. You get consistent performance and full control over your deployment, with no infrastructure to manage.

What Are Friendli Dedicated Endpoints

Each endpoint runs your model on its own GPU instance. With Friendli Dedicated Endpoints, you can:
  • Bring your own model: Run your own model or deploy any model from Hugging Face.
  • Choose dedicated resources: Select the GPU type for your workload. Each instance is fully dedicated to your model—no shared resources.
  • Scale reliably: Trusted by leading companies for robust performance on production workloads.
  • Pay per second: You are billed only for the time your model runs. See Pricing for details.
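
To make per-second billing concrete, here is a minimal sketch of the arithmetic, assuming a flat hourly GPU rate prorated by the second. The function name `per_second_cost` and the rate of 3.60 per hour are hypothetical values chosen for illustration, not Friendli prices or part of any Friendli API. See Pricing for actual rates and billing rules.

```python
from decimal import Decimal


def per_second_cost(hourly_rate_usd: Decimal, seconds_running: int) -> Decimal:
    """Prorate an hourly rate over the number of seconds an instance ran.

    Hypothetical helper for illustration only; it is not a Friendli API.
    """
    if seconds_running < 0:
        raise ValueError("seconds_running must be non-negative")
    per_second_rate = hourly_rate_usd / Decimal(3600)
    # Round to whole cents for display.
    return (per_second_rate * seconds_running).quantize(Decimal("0.01"))


# Assumed example rate (NOT an actual price): 3.60 USD per GPU-hour.
example_rate = Decimal("3.60")

# An endpoint that ran for 90 minutes is charged for 5,400 seconds.
print(per_second_cost(example_rate, 90 * 60))  # prints 5.40
```

Under this assumption, an endpoint that runs for 90 minutes costs 1.5 times the hourly rate, and one that is not running costs nothing.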

Next Steps

QuickStart

Deploy your first endpoint.

Pricing

Review per-second pricing.

Browse Models

Explore available models.
Last modified on June 22, 2026