Integrations
How do I integrate a Hugging Face account?
- Log in to Hugging Face, then navigate to Access Tokens.
- Create a new token. You may use a fine-grained token; in that case, make sure the token has view permission for the repository you’d like to use.
- Integrate the key in Friendli Suite > Personal Settings > Integrations.
If you revoke or invalidate the key, you must update it in Friendli Suite. Otherwise, ongoing deployments may be disrupted and you will not be able to launch a new inference deployment.
Using a 3rd-party model
How can I use a Hugging Face repository as a model?

- Use the repository id of the model. You may select the entry from the list of autocompleted model repositories.
- You may choose a specific branch, or manually enter a commit hash.
Format requirements
What are the format requirements for a model?
- A model should be in safetensors format.
- The model should NOT be nested inside another directory.
- The repository may include other files that are not in the list below. However, those files will be neither downloaded nor used.
| Required | Filename | Description |
|---|---|---|
| Yes | safetensors | Model weight, e.g. model.safetensors. Use model.safetensors.index.json for split safetensors files |
| Yes | config.json | Model config that includes the architecture. (Supported Models on Friendli) |
| No | tokenizer.json | Tokenizer for the model |
| No | tokenizer_config.json | Tokenizer config. This should be present & have a chat_template field for the Friendli Engine to provide chat APIs |
| No | special_tokens_map.json | Mapping from the tokenizer’s special tokens to their corresponding token strings |
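The requirements in the table can be checked locally before uploading. The following is a minimal sketch, assuming the model files have been downloaded to a local directory. The function name `check_model_dir` and its messages are hypothetical and not part of any Friendli tooling.

```python
import json
from pathlib import Path


def check_model_dir(model_dir):
    """Return a list of problems found in a local model directory (hypothetical helper)."""
    root = Path(model_dir)
    problems = []

    # Weights must be safetensors files at the top level, not in a nested directory.
    shards = list(root.glob("*.safetensors"))
    if not shards:
        if any(root.glob("*/*.safetensors")):
            problems.append("safetensors files are nested inside a subdirectory")
        else:
            problems.append("no .safetensors weight file found")
    elif len(shards) > 1 and not (root / "model.safetensors.index.json").is_file():
        # Split weights need an index file.
        problems.append("split safetensors files without model.safetensors.index.json")

    if not (root / "config.json").is_file():
        problems.append("config.json is missing")

    # tokenizer_config.json is optional, but chat APIs need its chat_template field.
    tok_cfg = root / "tokenizer_config.json"
    if not tok_cfg.is_file():
        problems.append("tokenizer_config.json is missing (chat APIs unavailable)")
    else:
        cfg = json.loads(tok_cfg.read_text(encoding="utf-8"))
        if "chat_template" not in cfg:
            problems.append("tokenizer_config.json has no chat_template field")

    return problems
```

An empty list means the directory passes these checks. The sketch does not verify that the architecture in config.json is supported.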
What are the format requirements for a dataset?
The dataset should satisfy the following conditions:
- The dataset must contain a column named “messages”.
- Each row in the “messages” column should be compatible with the chat template of the base model.
For example, the chat template in tokenizer_config.json of mistralai/Mistral-7B-Instruct-v0.2 alternates between messages from a user and an assistant. Concretely, each row in the “messages” field should follow a format like: [{"role": "user", "content": "The 1st user's message"}, {"role": "assistant", "content": "The 1st assistant's message"}]. In this case, HuggingFaceH4/ultrachat_200k is a dataset that is compatible with the chat template.
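The alternating user/assistant pattern described above can be checked row by row before uploading a dataset. This is a minimal sketch that assumes strictly alternating turns starting with the user, as in the example. The helper `validate_messages` is hypothetical, and the base model's actual chat template remains the authoritative check.

```python
def validate_messages(messages):
    """Return None if a row alternates user/assistant turns, else an error string."""
    if not isinstance(messages, list) or not messages:
        return "messages must be a non-empty list"
    expected = ("user", "assistant")
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            return f"message {i} is not an object"
        # Even positions must be user turns, odd positions assistant turns.
        if msg.get("role") != expected[i % 2]:
            return f"message {i} should have role {expected[i % 2]!r}"
        if not isinstance(msg.get("content"), str):
            return f"message {i} has no string content"
    return None
```

Templates that accept other roles, such as a system message, would need a looser check than this one.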
Troubleshooting
Inference request errors
Common error codes for inference requests
Below is a table of common error codes you might encounter when making inference-related API requests.
| Code | Name | Cause | Suggested Solution |
|---|---|---|---|
| 400 | Bad Request | The request is malformed or missing required fields. | Check your request payload. Ensure it is valid JSON with all required fields. |
| 401 | Unauthorized | Missing or invalid API Key. The request lacks proper authentication. | Include a valid Personal API Key in the Authorization header. Verify the key is active and correct. |
| 403 | Forbidden | The API Key is valid but does not have permission to access the endpoint. | Ensure your Personal API Key has access rights to the endpoint. Use the correct team key or add the X-Friendli-Team header if needed. |
| 404 | Not Found | The specified endpoint or resource does not exist. This typically occurs when the endpoint_id or team_id is invalid. | Verify the endpoint_id and model name in your request. Ensure they match an existing, non-deleted deployment. Also check for typos in your endpoint ID or team ID. |
| 422 | Unprocessable Entity | The request is syntactically correct but semantically invalid (e.g. exceeding token limits, invalid parameter values). | Adjust your request (e.g. reduce max_tokens, correct parameter values) and try again. |
| 429 | Too Many Requests | You have exceeded rate limits for your plan. | Reduce request frequency or upgrade your plan for higher limits. Wait before retrying after a 429 error. |
| 500 | Internal Server Error | A server-side error occurred while processing the request. | Retry the request after a short delay. If the error persists, check endpoint health in the overview dashboard or contact FriendliAI support. |
Quick checklist before retrying
- Verify the endpoint URL, endpoint_id, and (if applicable) X-Friendli-Team header
- Include the Authorization header with a valid key
- Confirm the target deployment exists, is healthy, and is not deleted
- Validate request JSON and required fields; reduce max_tokens if needed
- Check rate limits; add retry with backoff when receiving 429
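Retry with backoff can be sketched as follows. This is a minimal illustration, not an official client. `send` is a hypothetical zero-argument callable that performs the request and returns a `(status_code, body)` tuple. Treating 429 and 500 as retryable follows the suggested solutions in the table above, and the delay values are assumptions.

```python
import random
import time

RETRYABLE = {429, 500}  # the codes the error table suggests retrying


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        # Return immediately on success, on errors that need a request fix
        # (400, 401, 403, 404, 422), or when no attempts remain.
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, body
        # Exponential backoff with jitter: roughly 1s, 2s, 4s, ...
        delay = base_delay * (2 ** attempt)
        sleep(delay + random.uniform(0, delay / 2))
```

Errors such as 400 or 401 are returned to the caller without retrying, because repeating an unchanged request will not fix them.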
Model selection errors
You don't have access to this gated model

The repository / artifact is invalid


The architecture is not supported

Endpoint lifecycle
Why was my endpoint suddenly terminated?
Endpoints that remain in a sleep state for 48 hours are automatically terminated.
- When min_replicas = 0, the endpoint enters a sleep state after the cooldown period if no requests are received.
- A notification is sent after 24 hours of sleep, and the endpoint is terminated after another 24 hours if not reactivated.
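The timeline above is simple date arithmetic. The sketch below assumes you know when the endpoint entered the sleep state and that the clock starts at that moment. The function `sleep_timeline` is hypothetical, and the platform's own notification remains the authoritative signal.

```python
from datetime import datetime, timedelta, timezone

NOTIFY_AFTER = timedelta(hours=24)     # notification after 24 hours of sleep
TERMINATE_AFTER = timedelta(hours=48)  # termination after another 24 hours


def sleep_timeline(sleep_started_at):
    """Return the expected notification and termination times for a sleeping endpoint."""
    return {
        "notification_at": sleep_started_at + NOTIFY_AFTER,
        "termination_at": sleep_started_at + TERMINATE_AFTER,
    }


if __name__ == "__main__":
    started = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
    for name, when in sleep_timeline(started).items():
        print(name, when.isoformat())
```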