Dedicated Create Endpoint
Create a Friendli Dedicated Endpoint deployment for a Hugging Face model via the API. Specify GPU type, replica count, and model configuration.
Authorizations
Headers
ID of the team to run requests as (optional).
Body
Dedicated endpoint create request.
The ID of the project that owns the endpoint.
The name of the endpoint.
The ID of the instance option.
Available options:
- 1x NVIDIA A100 80GB: ShbPuOs4tfGb
- 2x NVIDIA A100 80GB: mrAHuYt7T40o
- 4x NVIDIA A100 80GB: JkNob0NMdoF3
- 8x NVIDIA A100 80GB: sYH4kHmAcA5P
- 1x NVIDIA H100: TwD5AqnBSVN0
- 2x NVIDIA H100: zfTutSiLn0Hq
- 4x NVIDIA H100: lfkRz5G48REc
- 8x NVIDIA H100: GUA4qYFmsYz8
- 1x NVIDIA H200: LnK1wTaKc7WO
- 2x NVIDIA H200: Tu6GjBnfHPe4
- 4x NVIDIA H200: OhTzYtZuomzI
- 8x NVIDIA H200: ahBzWtOuomsI
- 1x NVIDIA B200: 8GiQTLKfJNOr
- 2x NVIDIA B200: brTZGIuYgVrs
- 4x NVIDIA B200: AFoZMFXZnAdD
- 8x NVIDIA B200: drbc6G9FxJWZ
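Because the instance option IDs are opaque strings, it can help to keep them in a lookup table rather than pasting them into requests by hand. The following is a minimal sketch, not part of the API: the table name `INSTANCE_OPTION_IDS` and the helper `instance_option_id` are hypothetical, and the IDs are copied from the list of available options.

```python
# Instance option IDs from the "Available options" list,
# keyed by (GPU model, GPU count). The names here are illustrative only.
INSTANCE_OPTION_IDS = {
    ("A100 80GB", 1): "ShbPuOs4tfGb",
    ("A100 80GB", 2): "mrAHuYt7T40o",
    ("A100 80GB", 4): "JkNob0NMdoF3",
    ("A100 80GB", 8): "sYH4kHmAcA5P",
    ("H100", 1): "TwD5AqnBSVN0",
    ("H100", 2): "zfTutSiLn0Hq",
    ("H100", 4): "lfkRz5G48REc",
    ("H100", 8): "GUA4qYFmsYz8",
    ("H200", 1): "LnK1wTaKc7WO",
    ("H200", 2): "Tu6GjBnfHPe4",
    ("H200", 4): "OhTzYtZuomzI",
    ("H200", 8): "ahBzWtOuomsI",
    ("B200", 1): "8GiQTLKfJNOr",
    ("B200", 2): "brTZGIuYgVrs",
    ("B200", 4): "AFoZMFXZnAdD",
    ("B200", 8): "drbc6G9FxJWZ",
}


def instance_option_id(gpu: str, count: int) -> str:
    """Return the instance option ID for a GPU model and GPU count.

    Hypothetical helper: raises ValueError for combinations that are
    not in the documented list (for example, 3x H100).
    """
    try:
        return INSTANCE_OPTION_IDS[(gpu, count)]
    except KeyError:
        raise ValueError(f"no instance option for {count}x NVIDIA {gpu}") from None
```

For example, `instance_option_id("H100", 4)` looks up the ID listed for 4x NVIDIA H100.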
The advanced configuration of the endpoint.
Hugging Face ID of the model.
Simple scaling options.
Autoscaling policy.
Hugging Face commit hash of the model.
The comment for the initial version.
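As a rough illustration of how the body fields above fit together, the sketch below assembles a request body as a plain dictionary and serializes it to JSON. This section does not give the JSON key names, so every key used here (`projectId`, `instanceOptionId`, `hfModelId`, and so on) is a placeholder assumption, as is the choice of which fields to treat as required; check the request schema for the actual names before sending anything. The sketch makes no network call.

```python
import json
from typing import Any, Dict, Optional


def build_create_request(
    project_id: str,
    name: str,
    instance_option_id: str,
    hf_model_id: str,
    *,
    hf_commit_hash: Optional[str] = None,
    comment: Optional[str] = None,
    advanced: Optional[Dict[str, Any]] = None,
    simple_scaling: Optional[Dict[str, Any]] = None,
    autoscaling_policy: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a create-endpoint body (hypothetical helper).

    All JSON keys below are ASSUMED placeholder names, not confirmed by
    this reference. Fields left as None are omitted from the body.
    """
    body = {
        "projectId": project_id,                  # ID of the owning project
        "name": name,                             # endpoint name
        "instanceOptionId": instance_option_id,   # e.g. an ID from the options list
        "hfModelId": hf_model_id,                 # Hugging Face model ID
        "hfCommitHash": hf_commit_hash,           # Hugging Face commit hash
        "initialVersionComment": comment,         # comment for the initial version
        "advanced": advanced,                     # advanced configuration
        "simpleScaling": simple_scaling,          # simple scaling options
        "autoscalingPolicy": autoscaling_policy,  # autoscaling policy
    }
    return {key: value for key, value in body.items() if value is not None}


if __name__ == "__main__":
    body = build_create_request(
        "my-project-id", "my-endpoint", "TwD5AqnBSVN0", "my-org/my-model"
    )
    print(json.dumps(body, indent=2))
```

Dropping unset optional fields keeps the payload minimal and leaves defaults to the server, which is a design choice of this sketch rather than a documented requirement.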
Response
Successfully created the endpoint.
Dedicated endpoint status.
The current status of the endpoint deployment.
UNKNOWN, INITIALIZING, RUNNING, UPDATING, SLEEPING, AWAKING, FAILED, STOPPING, TERMINATING, TERMINATED, READY
When the endpoint was created.
ErrorCode type.
WORKLOAD_INIT_UNKNOWN_ERROR, WORKLOAD_INIT_SETTINGS_ERROR, WORKLOAD_INIT_GRPC_ERROR, WORKLOAD_INIT_MANIFEST_NOT_FOUND_ERROR, WORKLOAD_INIT_MANIFEST_TYPE_ERROR, WORKLOAD_INIT_DOWNLOAD_ERROR, WORKLOAD_INIT_INVALID_TOKEN_ERROR, WORKLOAD_INIT_CANNOT_ACCESS_REPO_ERROR, WORKLOAD_INIT_HF_WANDB_API_ERROR, WORKLOAD_INIT_INSUFFICIENT_DISK_ERROR, INFERENCE_ENGINE_UNKNOWN_ERROR, INFERENCE_ENGINE_INVALID_ARGUMENT_ERROR, INFERENCE_ENGINE_MEMORY_ERROR, INFERENCE_ENGINE_METERING_CLIENT_CONFIG_ERROR
When the endpoint was last updated.
The current phase of the endpoint.
REQUESTING_VIRTUAL_MACHINE, DOWNLOADING_MODEL, ENGINE_INITIALIZING
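Client code that reads the response may find it convenient to model the status and phase values as enumerations. The sketch below is one possible way to do that; the class and function names are hypothetical, the member values are the strings listed above, and the fallback to `UNKNOWN` for unrecognized statuses is an assumption of this sketch rather than documented behavior. The `error_stage` helper relies only on the two prefixes visible in the error code list.

```python
from enum import Enum
from typing import Optional


class EndpointStatus(str, Enum):
    """Status values listed in this reference."""
    UNKNOWN = "UNKNOWN"
    INITIALIZING = "INITIALIZING"
    RUNNING = "RUNNING"
    UPDATING = "UPDATING"
    SLEEPING = "SLEEPING"
    AWAKING = "AWAKING"
    FAILED = "FAILED"
    STOPPING = "STOPPING"
    TERMINATING = "TERMINATING"
    TERMINATED = "TERMINATED"
    READY = "READY"


class EndpointPhase(str, Enum):
    """Phase values listed in this reference."""
    REQUESTING_VIRTUAL_MACHINE = "REQUESTING_VIRTUAL_MACHINE"
    DOWNLOADING_MODEL = "DOWNLOADING_MODEL"
    ENGINE_INITIALIZING = "ENGINE_INITIALIZING"


def parse_status(raw: str) -> EndpointStatus:
    """Map a raw status string to EndpointStatus.

    Assumption of this sketch: unrecognized values fall back to UNKNOWN
    so that a newly added server-side status does not crash the client.
    """
    try:
        return EndpointStatus(raw)
    except ValueError:
        return EndpointStatus.UNKNOWN


def parse_phase(raw: Optional[str]) -> Optional[EndpointPhase]:
    """Map a raw phase string to EndpointPhase, or None if absent or unrecognized."""
    if raw is None:
        return None
    try:
        return EndpointPhase(raw)
    except ValueError:
        return None


def error_stage(code: str) -> str:
    """Group an error code by its prefix in the documented list."""
    for prefix in ("WORKLOAD_INIT", "INFERENCE_ENGINE"):
        if code.startswith(prefix + "_"):
            return prefix
    return "OTHER"
```

A caller could, for instance, poll the endpoint and branch on `parse_status(...)`, using `error_stage(...)` to decide whether a failure happened while initializing the workload or inside the inference engine.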