GET /dedicated/beta/endpoint/{endpoint_id}
curl --request GET \
  --url https://api.friendli.ai/dedicated/beta/endpoint/{endpoint_id} \
  --header 'Authorization: Bearer <token>'
{
  "name": "endpoint-name",
  "gpuType": "NVIDIA H100",
  "numGpu": 1,
  "instanceId": "instance-id",
  "projectId": "project-id",
  "creatorId": "creator-id",
  "teamId": "team-id",
  "autoscalingMin": 0,
  "autoscalingMax": 1,
  "autoscalingCooldown": 300,
  "maxBatchSize": 10,
  "maxInputLength": 1024,
  "tokenizerSkipSpecialTokens": true,
  "tokenizerAddSpecialTokens": true,
  "currReplicaCnt": 1,
  "desiredReplicaCnt": 1,
  "updatedReplicaCnt": 1
}

To make a successful request, you must supply a Friendli Token (e.g. flp_XXX) in the Bearer Token field. Refer to the authentication section on our introduction page to learn how to acquire this token, and visit here to generate one.

This API is currently in Beta. While we strive to provide a stable and reliable experience, this feature is still under active development. As a result, you may encounter unexpected behavior or limitations. We encourage you to provide feedback to help us improve the feature before its official release.
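The curl example above can also be expressed in Python. The sketch below, using only the standard library, prepares the GET request with the required Authorization header and the optional X-Friendli-Team header described under Headers; the token and endpoint ID values are placeholders.

```python
import urllib.request


def build_endpoint_request(endpoint_id, token, team_id=None):
    """Prepare (but do not send) the GET request for an endpoint spec."""
    headers = {"Authorization": f"Bearer {token}"}
    if team_id is not None:
        # Optional header: run the request as a specific team.
        headers["X-Friendli-Team"] = team_id
    return urllib.request.Request(
        f"https://api.friendli.ai/dedicated/beta/endpoint/{endpoint_id}",
        headers=headers,
        method="GET",
    )


# Sending the request requires a valid Friendli Token:
# with urllib.request.urlopen(build_endpoint_request("my-endpoint", "flp_XXX")) as resp:
#     spec = resp.read()
```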

Authorizations

Authorization
string
header
required

When using the Friendli Suite API for inference requests, you must provide a Friendli Token for authentication and authorization.

For more detailed information, please refer here.

Headers

X-Friendli-Team
string | null

ID of the team to run requests as (optional).

Path Parameters

endpoint_id
string
required

The ID of the endpoint.

Response

200
application/json
Successfully retrieved the endpoint specification.

Dedicated endpoint specification.

autoscalingCooldown
integer
required

The cooldown period in seconds between scaling operations.

autoscalingMax
integer
required

The maximum number of replicas allowed.

autoscalingMin
integer
required

The minimum number of replicas to maintain.

creatorId
string
required

The ID of the user who created the endpoint.

currReplicaCnt
integer
required

The current number of replicas.

desiredReplicaCnt
integer
required

The desired number of replicas.

gpuType
string
required

The type of GPU to use for the endpoint.

maxBatchSize
integer
required

The maximum batch size for inference requests.

name
string
required

The name of the endpoint.

numGpu
integer
required

The number of GPUs to use per replica.

projectId
string
required

The ID of the project that owns the endpoint.

teamId
string
required

The ID of the team that owns the endpoint.

tokenizerAddSpecialTokens
boolean
required

Whether to add special tokens in tokenizer input.

tokenizerSkipSpecialTokens
boolean
required

Whether to skip special tokens in tokenizer output.

updatedReplicaCnt
integer
required

The updated number of replicas.

instanceId
string | null

The ID of the instance.

maxInputLength
integer | null

The maximum allowed input length.
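Putting the schema above together, a 200 response body can be mapped onto a typed structure. The dataclass below is a sketch mirroring the documented fields, with the two nullable fields (instanceId, maxInputLength) defaulting to None; the class and function names are illustrative, not part of any SDK.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class EndpointSpec:
    """Dedicated endpoint specification, per the response schema above."""
    name: str
    gpuType: str
    numGpu: int
    projectId: str
    creatorId: str
    teamId: str
    autoscalingMin: int
    autoscalingMax: int
    autoscalingCooldown: int  # seconds between scaling operations
    maxBatchSize: int
    tokenizerSkipSpecialTokens: bool
    tokenizerAddSpecialTokens: bool
    currReplicaCnt: int
    desiredReplicaCnt: int
    updatedReplicaCnt: int
    instanceId: Optional[str] = None      # string | null
    maxInputLength: Optional[int] = None  # integer | null


def parse_endpoint_spec(payload: str) -> EndpointSpec:
    """Parse the JSON body of a successful response."""
    return EndpointSpec(**json.loads(payload))
```

For example, parsing the sample response body shown at the top of this page yields a spec with gpuType "NVIDIA H100" and autoscalingCooldown 300.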