Authorizations
Headers
ID of the team to run requests as (optional).
Body
Dedicated endpoint create request.
The ID of the project that owns the endpoint.
The name of the endpoint.
The ID of the instance option.
The advanced configuration of the endpoint.
Hugging Face ID of the model.
The simple scaling configuration of the endpoint (simple scaling options).
The autoscaling configuration of the endpoint (autoscaling policy).
Hugging Face commit hash of the model.
The comment for the initial version.
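As an illustration of how the body fields above fit together, here is a minimal Python sketch that assembles a create request body. The JSON key names (`project_id`, `name`, `instance_option_id`, `hf_model_repo`, `hf_model_revision`, `initial_version_comment`, `simplescale`, `autoscaling_policy`, `advanced`) and the helper `build_create_request` are hypothetical, since this section lists only field descriptions; check them against the actual schema before use. The sketch also assumes the first four fields are required, which this section does not state.

```python
import json
from typing import Any, Optional


def build_create_request(
    project_id: str,
    name: str,
    instance_option_id: str,
    hf_model_repo: str,
    hf_model_revision: Optional[str] = None,
    initial_version_comment: Optional[str] = None,
    simplescale: Optional[dict] = None,
    autoscaling_policy: Optional[dict] = None,
    advanced: Optional[dict] = None,
) -> dict[str, Any]:
    """Assemble a create-endpoint body. All key names are hypothetical."""
    body = {
        "project_id": project_id,                # project that owns the endpoint
        "name": name,                            # endpoint name
        "instance_option_id": instance_option_id,
        "hf_model_repo": hf_model_repo,          # Hugging Face model ID
        "hf_model_revision": hf_model_revision,  # Hugging Face commit hash
        "initial_version_comment": initial_version_comment,
        "simplescale": simplescale,              # contents not specified here
        "autoscaling_policy": autoscaling_policy,
        "advanced": advanced,
    }
    # Drop fields the caller did not set so they are omitted from the JSON.
    return {key: value for key, value in body.items() if value is not None}


if __name__ == "__main__":
    example = build_create_request("proj-1", "my-endpoint", "opt-1", "org/model")
    print(json.dumps(example, indent=2))
```

Unset optional fields are dropped rather than sent as `null`, so the server's defaults (whatever they are) apply.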
Response
Successfully created the endpoint.
Dedicated endpoint status.
The current status of the endpoint deployment.
UNKNOWN, INITIALIZING, RUNNING, UPDATING, SLEEPING, AWAKING, FAILED, STOPPING, TERMINATING, TERMINATED, READY
When the endpoint was created.
Error code if deployment failed. ErrorCode type.
WORKLOAD_INIT_UNKNOWN_ERROR, WORKLOAD_INIT_SETTINGS_ERROR, WORKLOAD_INIT_GRPC_ERROR, WORKLOAD_INIT_MANIFEST_NOT_FOUND_ERROR, WORKLOAD_INIT_MANIFEST_TYPE_ERROR, WORKLOAD_INIT_DOWNLOAD_ERROR, WORKLOAD_INIT_INVALID_TOKEN_ERROR, WORKLOAD_INIT_CANNOT_ACCESS_REPO_ERROR, WORKLOAD_INIT_HF_WANDB_API_ERROR, WORKLOAD_INIT_INSUFFICIENT_DISK_ERROR, INFERENCE_ENGINE_UNKNOWN_ERROR, INFERENCE_ENGINE_INVALID_ARGUMENT_ERROR, INFERENCE_ENGINE_MEMORY_ERROR, INFERENCE_ENGINE_METERING_CLIENT_CONFIG_ERROR
When the endpoint was last updated.
The current phase of the endpoint.
REQUESTING_VIRTUAL_MACHINE, DOWNLOADING_MODEL, ENGINE_INITIALIZING
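The status, error code, and phase values listed in the response can be read together. The following Python sketch shows one way a client might summarize a status payload. The status and phase values are copied from the lists in this section; the payload keys (`status`, `phase`, `error_code`) and the helper names `error_stage` and `summarize` are hypothetical. Grouping error codes by their `WORKLOAD_INIT_` or `INFERENCE_ENGINE_` prefix is a convenience inferred from the naming, not documented behavior.

```python
STATUSES = frozenset(
    "UNKNOWN INITIALIZING RUNNING UPDATING SLEEPING AWAKING FAILED "
    "STOPPING TERMINATING TERMINATED READY".split()
)
PHASES = frozenset(
    ["REQUESTING_VIRTUAL_MACHINE", "DOWNLOADING_MODEL", "ENGINE_INITIALIZING"]
)


def error_stage(error_code: str) -> str:
    """Group an error code by its prefix (an inferred convention)."""
    if error_code.startswith("WORKLOAD_INIT_"):
        return "workload_init"
    if error_code.startswith("INFERENCE_ENGINE_"):
        return "inference_engine"
    return "unknown"


def summarize(payload: dict) -> str:
    """Return a one-line summary of a status payload (keys are hypothetical)."""
    status = payload.get("status")
    if status not in STATUSES:
        raise ValueError(f"unrecognized status: {status!r}")
    error_code = payload.get("error_code")
    if error_code:
        # Report the failure first, with the stage it appears to belong to.
        return f"{status} ({error_stage(error_code)}: {error_code})"
    phase = payload.get("phase")
    if phase in PHASES:
        return f"{status} ({phase})"
    return status


if __name__ == "__main__":
    print(summarize({"status": "INITIALIZING", "phase": "DOWNLOADING_MODEL"}))
```

Rejecting unrecognized status strings makes the client fail loudly if the server introduces values not listed here; a more tolerant client could map them to `UNKNOWN` instead.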