- March 15, 2026
- 7 min read
Integrating FriendliAI with OpenClaw
- You can connect FriendliAI Serverless Endpoints to OpenClaw to run agents on high-performance open models.
- You can configure multiple model providers, fallback strategies, and specialized agents to handle different types of tasks.
- You can optionally connect agents to Discord to route requests from real-world channels.

Why OpenClaw + FriendliAI
Modern AI applications are no longer built around a single model call. Instead, they rely on agent systems that coordinate models, tools, and external services to complete complex tasks. These agents often make multiple inference calls while reasoning, planning, and interacting with tools.
When these repeated inference calls rely on proprietary models, costs can grow quickly. At the same time, recent open-source models have significantly improved in capability while remaining much more cost-efficient, making them an increasingly attractive option for building agent systems.
However, running agent systems in practice requires more than just models. Developers also need infrastructure to manage agents, coordinate model calls, maintain conversation state, and integrate with external platforms.
OpenClaw is an open agent framework designed to simplify the development and operation of agent-based systems. It provides infrastructure for:
- running persistent AI agents
- orchestrating multiple models
- managing long-running conversations
- integrating with platforms like Discord
While OpenClaw handles the agent orchestration layer, you still need a reliable way to run models efficiently.
This is where FriendliAI comes in. FriendliAI provides high-performance, serverless access to leading open models such as GLM-5, Minimax-M2.5, and other large-scale language models, without requiring you to manage GPU infrastructure.
By combining OpenClaw for orchestration with FriendliAI for model inference, you can build scalable multi-agent systems powered by high-quality open models while keeping operational complexity and costs low.
In this guide, we will walk through integrating FriendliAI models with OpenClaw and building a multi-agent setup that supports multiple models, fallback strategies, and optional external integrations.
What you will build
We will cover:
- Basic integration - connect OpenClaw to FriendliAI
- Model fallback - improve reliability across providers
- Multi-agent setup - assign models to specialized agents
- External integrations - connect agents to Discord
By the end of this tutorial, you will have a working OpenClaw setup that runs multiple agents across multiple models using FriendliAI as the inference backend.
Part 1: Basic Integration
Let’s get a single FriendliAI-powered agent running in OpenClaw.
Step 1: Sign up for FriendliAI and generate an API key
First, create a FriendliAI account and generate an API key.
- Sign up at friendli.ai
- Navigate to the Friendli Suite dashboard.
- Generate a Friendli Token and copy it. You’ll need it in the next step.
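With the token in hand, you can export it as an environment variable so later commands can reference it. The variable name FRIENDLI_API_KEY matches the placeholder used in the setup step below:

```bash
# Export your Friendli Token so subsequent commands can read it.
# Replace the placeholder with the token copied from Friendli Suite.
export FRIENDLI_API_KEY="<your-friendli-token>"
```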
Step 2: Install OpenClaw
Next, install OpenClaw if you haven’t already.
This script will install OpenClaw and its CLI tools.
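The install command itself is not reproduced here; as of this writing OpenClaw is distributed as an npm package, so a typical install looks like the following sketch (this assumes Node.js is already installed; see https://docs.openclaw.ai/install for the current, authoritative command):

```bash
# Install the OpenClaw CLI globally via npm (assumes a recent Node.js).
npm install -g openclaw

# Confirm the CLI is on your PATH.
openclaw --version
```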
Step 3: Run the setup script
Run the following command in your terminal.
The script will automatically:
- Create ~/.openclaw/openclaw.json if not present, or merge with the existing configuration if present
- Configure FriendliAI as a model provider
- Register GLM-5 as an available model
Replace $FRIENDLI_API_KEY with your actual FriendliAI API key.
Once the script runs successfully, your OpenClaw configuration will include a FriendliAI provider similar to the following:
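A sketch of what that provider entry might look like in ~/.openclaw/openclaw.json. The field names and the GLM-5 model ID below are illustrative assumptions rather than the authoritative OpenClaw schema; the base URL is FriendliAI's OpenAI-compatible serverless endpoint:

```json
{
  "models": {
    "providers": {
      "friendli": {
        "baseUrl": "https://api.friendli.ai/serverless/v1",
        "apiKey": "$FRIENDLI_API_KEY",
        "models": ["GLM-5"]
      }
    }
  }
}
```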
At this point, OpenClaw will recognize FriendliAI as a model provider.
Step 4: Run OpenClaw onboarding
Run the following command to finalize the installation. This command starts the onboarding flow, guiding you through the initial setup process such as configuring required settings and preparing your environment to use OpenClaw.
The --install-daemon flag installs the OpenClaw background service, allowing agents to run continuously and respond to requests from external integrations like Discord.
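Assuming onboarding is exposed as a CLI subcommand (the exact name may differ in your OpenClaw version; check openclaw --help), the invocation looks like:

```bash
# Start the onboarding flow and install the background service.
openclaw onboard --install-daemon
```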
You can use the Quickstart onboarding mode.
During onboarding:
- OpenClaw will auto-detect the FriendliAI provider
- Skip the Model/Auth provider step
- Choose “All Providers” for the model filter
- Select “Keep Current” for the default model
Once onboarding is complete, run the following command to verify that your configuration is valid and that OpenClaw is ready to run using FriendliAI-hosted models.
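The subcommand name below is our assumption for the configuration check; consult openclaw --help if your version differs:

```bash
# Validate the merged configuration and provider setup.
openclaw doctor
```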
Your OpenClaw environment is now connected to FriendliAI Serverless Endpoints, allowing your agents to run on models such as GLM-5.
You can now start interacting with your agent:
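The exact entry point varies by OpenClaw version; a typical interactive session is started from the terminal with something like the following (the subcommand name is illustrative, not confirmed by the source):

```bash
# Open an interactive session with the default agent.
openclaw chat
```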
After running this command, you will be able to interact with OpenClaw directly in your terminal like this:

See the following guides to learn more:
- Get Started: https://docs.openclaw.ai/
- Installation: https://docs.openclaw.ai/install
Part 2: Advanced Configuration
The basic setup is enough for simple agent workflows, but OpenClaw becomes much more powerful when you combine multiple models and agents.
For example, using multiple models enables fallback when one model fails or performs poorly, while multi-agent setups allow different agents to handle specialized tasks across various use cases.
In this section, we will explore:
- Configuring multiple providers and models
- Setting up a fallback model
- Configuring multiple agents
- Binding agents to external applications
These features allow you to build more robust AI systems by combining different models for different tasks.
Configuring Multiple Model Providers
To increase flexibility, you can register multiple model providers in OpenClaw. This allows you to access models from different providers and use them across various workflows and agents. In practice, teams often use multiple providers to improve reliability (e.g., switching providers when rate limits occur), access different model capabilities, or balance cost and performance.
The first step is to configure multiple providers in the models.providers section of the OpenClaw configuration.
In our setup, we register both FriendliAI and Anthropic as model providers.
| Provider | Models |
| --- | --- |
| FriendliAI | GLM-5, Qwen3-30B-A3B |
| Anthropic | Claude Opus 4.6 |
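With both providers registered, the models.providers section of ~/.openclaw/openclaw.json might look roughly like this sketch (field names and model IDs are illustrative assumptions, not the authoritative OpenClaw schema):

```json
{
  "models": {
    "providers": {
      "friendli": {
        "baseUrl": "https://api.friendli.ai/serverless/v1",
        "apiKey": "$FRIENDLI_API_KEY",
        "models": ["GLM-5", "Qwen3-30B-A3B"]
      },
      "anthropic": {
        "apiKey": "$ANTHROPIC_API_KEY",
        "models": ["claude-opus-4-6"]
      }
    }
  }
}
```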
This configuration gives OpenClaw access to models from both providers.
Model Fallback Strategy
One common production pattern is configuring fallback models. In real-world systems, model requests can fail for various reasons, such as rate limits, provider outages, network errors, or exceeding usage or budget limits. When this happens, relying on a single model provider can interrupt workflows or cause requests to fail entirely.
A fallback strategy helps prevent these interruptions by automatically routing the request to another available model if the primary model cannot handle it. This ensures that tasks can continue running even when the preferred model is unavailable.
In the configuration below, we define Claude Opus 4.6 as the primary model and GLM-5 on FriendliAI as a fallback.
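A sketch of such a fallback definition (the primary/fallbacks field names are assumptions; the model references follow a provider/model naming convention chosen for illustration):

```json
{
  "model": {
    "primary": "anthropic/claude-opus-4-6",
    "fallbacks": ["friendli/GLM-5"]
  }
}
```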
This means:
- OpenClaw will first attempt to run the task using Claude Opus
- If that fails, it will automatically fall back to GLM-5 hosted on FriendliAI
Creating Multiple Specialized Agents
OpenClaw allows you to define multiple agents in the agents.list section.
Each agent can run on a different model depending on the type of task.
In our configuration, we define two agents:
- main - reasoning tasks
- fast - low latency responses
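Put together, the agents.list section might look like the following sketch (agents.list is the section named by this guide; the per-agent field names are our assumptions):

```json
{
  "agents": {
    "list": [
      {
        "id": "main",
        "model": {
          "primary": "anthropic/claude-opus-4-6",
          "fallbacks": ["friendli/GLM-5"]
        }
      },
      {
        "id": "fast",
        "model": {
          "primary": "friendli/Qwen3-30B-A3B"
        }
      }
    ]
  }
}
```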
This configuration separates workloads across different models:
| Agent | Model | Purpose |
| --- | --- | --- |
| main | Claude Opus 4.6 → GLM-5 fallback | Complex reasoning |
| fast | Qwen3-30B-A3B | Low-latency responses |
This pattern is useful because different models excel at different tasks.
Now you can test these agents in the terminal using the same command mentioned above.
By typing /agent, you can view and switch between the agents we configured. You’ll then be able to interact with those agents running on Friendli models.

Connecting Discord Channels (Optional)
Note: The following steps are optional and go beyond the core integration between FriendliAI and OpenClaw. They demonstrate how to connect OpenClaw agents to an external application such as Discord.
In this example, we will connect OpenClaw to Discord by configuring Discord bots.
In the configuration below, we define two Discord bot accounts that correspond to different OpenClaw agents (main and fast). Each bot listens to a specific Discord server and channel, allowing requests from different channels to be routed to different agents.
To configure this integration, you will need the following information from Discord:
- DISCORD_BOT_TOKEN – the authentication token for your Discord bot
- DISCORD_SERVER_ID – the ID of the Discord server (guild)
- DISCORD_CHANNEL_ID – the ID of the channel where the bot should respond
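A sketch of the accounts section with two bots (the structure, field names, and account names main-bot/fast-bot are assumptions for illustration; substitute the Discord values collected above):

```json
{
  "channels": {
    "discord": {
      "accounts": {
        "main-bot": {
          "token": "$DISCORD_BOT_TOKEN_MAIN",
          "serverId": "$DISCORD_SERVER_ID",
          "channelId": "$DISCORD_CHANNEL_ID_DEEP_RESEARCH"
        },
        "fast-bot": {
          "token": "$DISCORD_BOT_TOKEN_FAST",
          "serverId": "$DISCORD_SERVER_ID",
          "channelId": "$DISCORD_CHANNEL_ID_EVERYDAY_CHAT"
        }
      }
    }
  }
}
```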
Each account in the accounts section represents a Discord bot configuration.
It specifies which Discord server and channel the bot listens to, allowing OpenClaw to route messages from that channel to the corresponding agent.
For more details, refer to the OpenClaw documentation on Discord integration. You may also need to complete additional configuration in the Discord Developer Portal.
Once Discord is connected, you can route requests to different agents based on the channel they come from.
Routing Agents by Channel (Optional)
So far, we have configured multiple agents and multiple channels. In this section, we will bind them together.
For example, requests from different Discord channels can be routed to different agents and models. A general Q&A channel can use a faster, lower-cost model, while a development or coding channel can use a more powerful model. This helps balance speed, cost, and capability across different workflows.
This is configured in the bindings section.
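A sketch of such a bindings section (field names are assumptions; the channel names match the Discord channels used in this guide):

```json
{
  "bindings": [
    { "agent": "fast", "match": { "channel": "everyday-chat" } },
    { "agent": "main", "match": { "channel": "deep-research" } }
  ]
}
```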
This routing configuration means:
| Channel | Agent |
| --- | --- |
| everyday-chat | fast |
| deep-research | main |
So in practice:
- everyday-chat → handled by the fast agent (Qwen3-30B-A3B)
- deep-research → handled by the main agent (Claude Opus 4.6 with GLM-5 fallback)
This allows you to create task-specific environments within the same system.
Putting It All Together
With this configuration in place, OpenClaw now supports:
- Multiple model providers
- Cross-provider fallback
- Specialized agents
- Channel-based routing
- Discord integration
This architecture allows you to build more flexible AI systems where different models handle different workloads.
Final Thoughts
By combining FriendliAI’s hosted models with OpenClaw’s agent orchestration, you can build scalable multi-model agent systems with minimal configuration.
From here, you can continue expanding your system with additional models, agents, and integrations depending on your workload.
👉 Learn more at FriendliAI
👉 Sign up for Friendli Suite and get instant access
Written by
FriendliAI Tech & Research
General FAQ
What is FriendliAI?
FriendliAI is a GPU-inference platform that lets you deploy, scale, and monitor large language and multimodal models in production, without owning or managing GPU infrastructure. We offer three things for your AI models: unmatched speed, cost efficiency, and operational simplicity. Find out which product is the best fit for you here.
How does FriendliAI help my business?
Our Friendli Inference allows you to squeeze more tokens-per-second out of every GPU. Because you need fewer GPUs to serve the same load, the true metric—tokens per dollar—comes out higher even if the hourly GPU rate looks similar on paper. View pricing
Which models and modalities are supported?
Over 520,000 text, vision, audio, and multi-modal models are deployable out of the box. You can also upload custom models or LoRA adapters. Explore models
Can I deploy models from Hugging Face directly?
Yes. A one-click deploy by selecting “Friendli Endpoints” on the Hugging Face Hub will take you to our model deployment page. The page provides an easy-to-use interface for setting up Friendli Dedicated Endpoints, a managed service for generative AI inference. Learn more about our Hugging Face partnership
Still have questions?
If you want a customized solution for that key issue that is slowing your growth, email contact@friendli.ai or click Talk to an engineer; our engineers (not a bot) will reply within one business day.

