Documentation Index

Fetch the complete documentation index at: https://friendli.ai/docs/llms.txt

Use this file to discover all available pages before exploring further.

Overview

Friendli Serverless Endpoints let you run state-of-the-art AI models instantly, without provisioning infrastructure, managing deployments, or configuring runtime settings. You can experiment interactively in the UI or integrate models directly into your application using simple API calls. This quickstart walks you through both options.

Try models in the playground

The fastest way to get started is by trying models directly in Friendli Suite. The playground provides an interactive, chat-based experience where you can test prompts, inspect responses, and tune inference settings.

1. Sign up or log in

Create an account or log in at Friendli Suite.

2. Navigate to the Serverless Endpoints page

Open the Serverless Endpoints page in Friendli Suite to see the list of available models, including GLM-5.1 and other frontier models.

3. Choose a model

Browse the list and select the model you want to use. Each model page provides a brief overview along with usage details.

4. Experiment in the playground

Once a model is selected, you can immediately start interacting with it in the Playground. The Playground offers a chat-style interface designed for rapid experimentation, with built-in tools such as a calculator, Python interpreter, and Linkup web search.
You can also adjust inference parameters, such as temperature and top-p, to tune the model’s behavior and output.
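The same sampling parameters are also accepted over the API. Below is a minimal sketch; the endpoint and model name are taken from the API section later in this guide, and the specific temperature and top_p values are illustrative assumptions, not recommendations:

```python
import os

# Sampling settings mirroring the Playground controls: a lower temperature
# makes output more deterministic; top_p restricts sampling to the most
# likely tokens. Values here are illustrative.
request = {
    "model": "zai-org/GLM-5.1",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.3,
    "top_p": 0.9,
}

api_key = os.getenv("API_KEY")
if api_key:
    # Requires the OpenAI SDK and a valid key (see the API section below).
    from openai import OpenAI

    client = OpenAI(api_key=api_key, base_url="https://api.friendli.ai/serverless/v1")
    completion = client.chat.completions.create(**request)
    print(completion.choices[0].message.content)
```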

Access models programmatically with the Serverless API

When you’re ready to integrate a model into your application, you can start sending API requests right away. Each model page includes ready-to-use example code, making it easy to copy, paste, and adapt to your workflow.

1. Sign up or log in

Create an account or log in at Friendli Suite.

2. Create a Personal API Key

Create and manage API keys under Personal Settings > API Keys. Then set your key as an environment variable:
export API_KEY="YOUR API KEY HERE"
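As a quick sanity check, you can hit the endpoint with curl before writing any code. This is a hedged sketch: the /models path is an assumption based on the OpenAI-compatible base URL used later in this guide:

```shell
# Placeholder key: replace with your real key from Personal Settings > API Keys.
export API_KEY="YOUR API KEY HERE"

if [ "$API_KEY" != "YOUR API KEY HERE" ]; then
  # List available models (path assumed from the OpenAI-compatible base URL).
  curl -s https://api.friendli.ai/serverless/v1/models \
    -H "Authorization: Bearer $API_KEY"
else
  echo "Set API_KEY to your real key first"
fi
```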

3. Install Friendli Python SDK or any OpenAI-compatible SDK

# uv
uv add friendli

# pip
pip install friendli

The example in the next step uses the OpenAI Python SDK; install it the same way if you prefer it:

# uv
uv add openai

# pip
pip install openai

4. Send API requests

The example below sends a chat completion request through the OpenAI SDK, pointing it at the Friendli base URL:
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("API_KEY"),
    base_url="https://api.friendli.ai/serverless/v1",
)

completion = client.chat.completions.create(
    model="zai-org/GLM-5.1",
    # Friendli-specific options go in extra_body: parse_reasoning splits the
    # model's reasoning out into message.reasoning_content, and enable_thinking
    # turns on the model's thinking mode.
    extra_body={
        "parse_reasoning": True,
        "chat_template_kwargs": {"enable_thinking": True},
    },
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)

print("Reasoning:", completion.choices[0].message.reasoning_content)
print(completion.choices[0].message.content)
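For longer generations you may want to stream tokens as they arrive rather than waiting for the full reply. The sketch below uses the OpenAI SDK's standard stream=True flag against the same endpoint; that the endpoint supports streaming for this model is an assumption based on its OpenAI compatibility:

```python
import os

# Same request as above, but with stream=True to receive incremental
# chunks instead of one final message.
request = {
    "model": "zai-org/GLM-5.1",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": True,
}

api_key = os.getenv("API_KEY")
if api_key:
    from openai import OpenAI

    client = OpenAI(api_key=api_key, base_url="https://api.friendli.ai/serverless/v1")
    for chunk in client.chat.completions.create(**request):
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
    print()
```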
Last modified on April 29, 2026