POST /dedicated/beta/endpoint
curl --request POST \
  --url https://api.friendli.ai/dedicated/beta/endpoint \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "advanced": {
    "tokenizer_add_special_tokens": true,
    "tokenizer_skip_special_tokens": true,
    "enable_content_logging": true,
    "max_batch_size": 256,
    "max_input_length": 123,
    "max_token_count": 2560
  },
  "hfModelRepo": "<string>",
  "instanceOptionId": "<string>",
  "name": "<string>",
  "projectId": "<string>",
  "autoscalingPolicy": {
    "cooldownPeriod": 300,
    "maxReplica": 1,
    "minReplica": 0
  },
  "hfModelRepoRevision": "<string>",
  "initialVersionComment": "<string>",
  "simplescale": {
    "replicas": 2
  }
}'

{
  "status": "INITIALIZING",
  "createdAt": "2025-01-01T00:00:00Z",
  "updatedAt": "2025-01-01T00:00:00Z",
  "phase": "DOWNLOADING_MODEL"
}

Requests must include a Friendli Token (e.g. flp_XXX) as the Bearer token. Refer to the authentication section on our introduction page to learn how to obtain and generate your token.

This API is currently in Beta. While we strive to provide a stable and reliable experience, this feature is still under active development. As a result, you may encounter unexpected behavior or limitations. We encourage you to provide feedback to help us improve the feature before its official release.

Authorizations

Authorization
string
header
required

When using the Friendli Suite API for inference requests, you need to provide a Friendli Token for authentication and authorization.

For more detailed information, refer to the authentication section of the documentation.

Headers

X-Friendli-Team
string | null

ID of the team to run requests as (optional).
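As a minimal sketch of the two headers described above, the helper below builds the header set for this call. The function name `build_headers` and the team ID value are hypothetical; only the header names `Authorization` and `X-Friendli-Team` come from this reference.

```python
from typing import Dict, Optional


def build_headers(token: str, team_id: Optional[str] = None) -> Dict[str, str]:
    """Return headers for the endpoint-creation request (hypothetical helper)."""
    if not token:
        raise ValueError("a Friendli Token is required")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    # X-Friendli-Team is optional, so it is sent only when a team ID is given.
    if team_id is not None:
        headers["X-Friendli-Team"] = team_id
    return headers


# "flp_XXX" is the placeholder token from this page; "my-team-id" is made up.
personal_headers = build_headers("flp_XXX")
team_headers = build_headers("flp_XXX", team_id="my-team-id")
```

Omitting the team header entirely, rather than sending an empty value, is a design choice of this sketch and not something the reference specifies.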

Body

application/json

Dedicated endpoint create request.

advanced
object
required

The advanced configuration of the endpoint.

hfModelRepo
string
required

Hugging Face ID of the model.

instanceOptionId
string
required

The ID of the instance option.

name
string
required

The name of the endpoint.

projectId
string
required

The ID of the project that owns the endpoint.

autoscalingPolicy
object | null

The autoscaling configuration of the endpoint.

hfModelRepoRevision
string | null

Hugging Face commit hash of the model.

initialVersionComment
string | null

The comment for the initial version.

simplescale
object | null

The simple scaling configuration of the endpoint.
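The body fields above could be assembled as in the following sketch, which uses only the Python standard library. The helper name `build_create_request` and all argument values are hypothetical placeholders; the URL and field names are taken from the curl example on this page. The request object is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Any, Dict

# URL taken from the curl example for this endpoint.
CREATE_URL = "https://api.friendli.ai/dedicated/beta/endpoint"

# Optional body fields listed in this reference.
OPTIONAL_FIELDS = {
    "autoscalingPolicy",
    "hfModelRepoRevision",
    "initialVersionComment",
    "simplescale",
}


def build_create_request(
    token: str,
    *,
    name: str,
    project_id: str,
    hf_model_repo: str,
    instance_option_id: str,
    advanced: Dict[str, Any],
    **optional: Any,
) -> urllib.request.Request:
    """Build (but do not send) the POST request. Hypothetical helper."""
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown body fields: {sorted(unknown)}")

    # The five required fields are always present.
    body: Dict[str, Any] = {
        "advanced": advanced,
        "hfModelRepo": hf_model_repo,
        "instanceOptionId": instance_option_id,
        "name": name,
        "projectId": project_id,
    }
    # Optional fields are included only when a value is supplied.
    body.update({k: v for k, v in optional.items() if v is not None})

    return urllib.request.Request(
        CREATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; the "advanced" settings mirror the curl example.
request = build_create_request(
    "flp_XXX",
    name="my-endpoint",
    project_id="my-project-id",
    hf_model_repo="my-org/my-model",
    instance_option_id="my-instance-option-id",
    advanced={
        "tokenizer_add_special_tokens": True,
        "tokenizer_skip_special_tokens": True,
        "enable_content_logging": True,
        "max_batch_size": 256,
        "max_input_length": 123,
        "max_token_count": 2560,
    },
    simplescale={"replicas": 2},
)
```

Passing `request` to `urllib.request.urlopen` would perform the call. Whether the server accepts a partial `advanced` object or both scaling objects at once is not stated here, so the sketch does not enforce either.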

Response

200
application/json
Successfully created the endpoint.

Dedicated endpoint status.

createdAt
string
required

When the endpoint was created.

status
enum<string>
required

The current status of the endpoint deployment.

Available options:
UNKNOWN,
INITIALIZING,
RUNNING,
UPDATING,
SLEEPING,
AWAKING,
FAILED,
STOPPING,
TERMINATING,
TERMINATED,
READY

errorCode
enum<string> | null

Error code if deployment failed.

Available options:
WORKLOAD_INIT_UNKNOWN_ERROR,
WORKLOAD_INIT_SETTINGS_ERROR,
WORKLOAD_INIT_GRPC_ERROR,
WORKLOAD_INIT_MANIFEST_NOT_FOUND_ERROR,
WORKLOAD_INIT_MANIFEST_TYPE_ERROR,
WORKLOAD_INIT_DOWNLOAD_ERROR,
WORKLOAD_INIT_INVALID_TOKEN_ERROR,
WORKLOAD_INIT_CANNOT_ACCESS_REPO_ERROR,
WORKLOAD_INIT_HF_WANDB_API_ERROR,
WORKLOAD_INIT_INSUFFICIENT_DISK_ERROR,
INFERENCE_ENGINE_UNKNOWN_ERROR,
INFERENCE_ENGINE_INVALID_ARGUMENT_ERROR,
INFERENCE_ENGINE_MEMORY_ERROR,
INFERENCE_ENGINE_METERING_CLIENT_CONFIG_ERROR

phase
enum<string> | null

The current phase of the endpoint.

Available options:
REQUESTING_VIRTUAL_MACHINE,
DOWNLOADING_MODEL,
ENGINE_INITIALIZING

updatedAt
string | null

When the endpoint was last updated.
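A response shaped like the one documented above could be parsed and checked against the listed enum values as in this sketch. The names `EndpointStatus` and `parse_status` are hypothetical. The timestamp handling assumes ISO 8601 strings with a trailing `Z`, as in the sample response; the reference itself only says the fields are strings.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional

# Enum values copied from this reference.
STATUSES = {
    "UNKNOWN", "INITIALIZING", "RUNNING", "UPDATING", "SLEEPING", "AWAKING",
    "FAILED", "STOPPING", "TERMINATING", "TERMINATED", "READY",
}
PHASES = {"REQUESTING_VIRTUAL_MACHINE", "DOWNLOADING_MODEL", "ENGINE_INITIALIZING"}


def _parse_time(value: Optional[str]) -> Optional[datetime]:
    # Assumption: timestamps look like "2025-01-01T00:00:00Z".
    if value is None:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class EndpointStatus:
    status: str
    created_at: datetime
    updated_at: Optional[datetime] = None
    phase: Optional[str] = None
    error_code: Optional[str] = None


def parse_status(payload: Dict[str, Any]) -> EndpointStatus:
    """Validate and convert a response body (hypothetical helper)."""
    status = payload["status"]          # required
    created_at = payload["createdAt"]   # required
    if status not in STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    phase = payload.get("phase")
    if phase is not None and phase not in PHASES:
        raise ValueError(f"unexpected phase: {phase!r}")
    return EndpointStatus(
        status=status,
        created_at=_parse_time(created_at),
        updated_at=_parse_time(payload.get("updatedAt")),
        phase=phase,
        error_code=payload.get("errorCode"),
    )


# The sample response from this page.
sample = {
    "status": "INITIALIZING",
    "createdAt": "2025-01-01T00:00:00Z",
    "updatedAt": "2025-01-01T00:00:00Z",
    "phase": "DOWNLOADING_MODEL",
}
parsed = parse_status(sample)
```

Rejecting unknown enum values is a strict choice. Because this API is in Beta and the value sets may grow, a more lenient client might log unexpected values instead of raising.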