Overview

Friendli Serverless Endpoints let you run state-of-the-art AI models instantly, without provisioning infrastructure, managing deployments, or configuring runtime settings. You can experiment interactively in the UI or integrate models directly into your application with simple API calls. This quickstart walks you through both options.

Try Models in the Playground

The fastest way to get started is to try models directly in Friendli Suite. The playground provides an interactive, chat-based experience where you can test prompts, inspect responses, and fine-tune inference settings.

1. Sign up or Log in

Create an account or log in at https://friendli.ai/suite.

2. Navigate to the Serverless Endpoints page

Open the Serverless page in Friendli Suite to see the list of available options, including GLM-4.6 and other leading LLMs, embedding models, and vision models.

3. Choose a Model

Browse the list and select the model you want to use. Each model page provides a brief overview along with usage details.

4. Experiment in the Playground

Once a model is selected, you can start interacting with it immediately in the Playground. The Playground offers a chat-style interface designed for rapid experimentation, with built-in tools such as a calculator, a Python interpreter, and Linkup web search. You can also adjust inference parameters, such as temperature and top-p, to fine-tune the model's behavior and output.

Access Models Programmatically with the Serverless API

When you’re ready to integrate a model into your application, you can start sending API requests right away. Each model page includes ready-to-use example code, making it easy to copy, paste, and adapt to your workflow.

1. Sign up or Log in

Create an account or log in at https://friendli.ai/suite.

2. Create an API token

You can create and manage API tokens under Suite -> Settings -> API Tokens. Then set your token as an environment variable:
export FRIENDLI_TOKEN="YOUR API KEY HERE"
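A missing or empty token is the most common cause of authentication errors, so it can help to fail fast before constructing a client. A minimal sketch (the helper name `get_friendli_token` is illustrative, not part of any SDK):

```python
import os

def get_friendli_token() -> str:
    """Read the Friendli token from the environment, failing fast if unset."""
    token = os.getenv("FRIENDLI_TOKEN")
    if not token:
        raise RuntimeError(
            "FRIENDLI_TOKEN is not set; create a token in "
            "Suite -> Settings -> API Tokens and export it first."
        )
    return token
```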

3. Install the Friendli Python SDK or any OpenAI-compatible SDK

The example below uses the official OpenAI Python SDK, which works with Friendli's OpenAI-compatible endpoint, so you can install either package:

# uv
uv add friendli        # or: uv add openai

# pip
pip install friendli   # or: pip install openai

4. Send API requests

With your token set, you can send your first chat completion request:
import os

from openai import OpenAI

# Friendli's serverless endpoint is OpenAI-compatible, so the OpenAI SDK
# works out of the box with a custom base_url.
client = OpenAI(
    api_key=os.getenv("FRIENDLI_TOKEN"),
    base_url="https://api.friendli.ai/serverless/v1",
)

completion = client.chat.completions.create(
    model="zai-org/GLM-4.6",
    # Friendli-specific options are passed through extra_body:
    # parse_reasoning surfaces the model's reasoning as reasoning_content,
    # and enable_thinking turns on the model's thinking mode.
    extra_body={
        "parse_reasoning": True,
        "chat_template_kwargs": {"enable_thinking": True},
    },
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)

print("Reasoning:", completion.choices[0].message.reasoning_content)
print(completion.choices[0].message.content)
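Because the endpoint speaks plain HTTP with JSON bodies, no SDK is strictly required. Below is a minimal sketch using only the Python standard library; it assumes the chat completions path is the base URL plus `/chat/completions` (the usual OpenAI-compatible convention), and it builds the request without sending it so you can inspect it first:

```python
import json
import os
import urllib.request

API_URL = "https://api.friendli.ai/serverless/v1/chat/completions"

def build_chat_request(model: str, messages: list) -> urllib.request.Request:
    """Build the HTTP request for a chat completion without sending it."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {os.getenv('FRIENDLI_TOKEN', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(
    "zai-org/GLM-4.6",
    [{"role": "user", "content": "Hello!"}],
)
# To actually send it:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
```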