- April 8, 2026
- 6 min read
Running OpenClaw with NemoClaw and FriendliAI
- OpenClaw enables powerful autonomous agents, but introduces new security and control challenges.
- NemoClaw adds a sandboxed, policy-driven runtime that isolates agents and manages external interactions safely.
- FriendliAI fits into this stack as the inference layer, providing fast and scalable open-weight model execution without changing agent code.

OpenClaw is an open-source agent framework built for always-on AI assistants: agents that operate across messaging platforms, interact with files, and execute real tasks autonomously. Unlike traditional chat interfaces that wait for user input, OpenClaw agents run continuously and take actions in the world.
That autonomy is powerful, but it also creates real security challenges. Agents that can call external APIs, handle credentials, and run code indefinitely are difficult to control. Prompt injection, malicious tool use, and unintended side effects are no longer theoretical when agents interact with live systems around the clock. NemoClaw is NVIDIA’s answer: a controlled runtime that wraps OpenClaw in a sandboxed, policy-enforced environment without changing how the agent itself behaves.
At the same time, many teams are increasingly turning to open models, such as GLM-5 or Minimax M2.5, when building agent systems. While cost is often the primary driver, open models also provide more control over behavior and deployment, which becomes important when agents run continuously and interact with external systems.
This post walks through how NemoClaw is structured and, more importantly, how FriendliAI plugs into that stack as the inference layer powering open model execution, with optimized performance and zero changes to your agent code.
Why NemoClaw Matters for OpenClaw
OpenClaw’s strength is precisely what makes it hard to run safely. Agents that operate autonomously (calling tools, making API requests, executing code) do so without a human in the loop. That creates an attack surface that traditional request-response systems never had to worry about.
This introduces a few practical concerns:
- Uncontrolled network access. OpenClaw agents can call any external API or internal endpoint without restriction. In practice, this means an agent handling a routine task could trigger downstream services, exfiltrate data, or rack up unexpected costs, all without explicit user intent.
- Credential exposure risks. When API keys are passed into or stored inside the agent runtime, they become accessible to any tool the agent uses, including malicious ones injected through prompt injection attacks. A compromised agent context means compromised credentials.
- Long-running execution risks. Agents that never stop are harder to audit and contain. State can accumulate, context can drift, and a single compromised action early in a session can affect everything that follows, with no natural checkpoint to catch it.
NemoClaw addresses these concerns by wrapping OpenClaw in a more controlled runtime environment.
At its core, NemoClaw is a system that runs OpenClaw agents inside a sandboxed and policy-controlled environment, while routing model access and external interactions through a managed layer.
Importantly, NemoClaw does not replace OpenClaw. It is an open source reference stack designed to make OpenClaw safer to run in real environments.

The diagram above shows how NemoClaw manages the lifecycle of an OpenClaw agent.
On the host side, NemoClaw starts with an onboarding process that sets up the environment using a predefined blueprint. This blueprint acts as a configuration plan that defines how the agent should run, including sandbox settings, inference routing, and security policies.
Once initialized, the OpenClaw agent runs inside an OpenShell sandbox, where its execution is isolated from the host system. Within this environment:
- Model inference is routed through a managed layer instead of direct external calls
- Network access is governed by default policies
- Filesystem access is restricted to a controlled scope
In other words, NemoClaw does not change how the agent behaves. It changes how and where the agent runs, introducing a controlled layer between the agent and the outside world.
This creates a clear boundary between agent logic and external resources. Instead of allowing direct access to models or services, interactions are mediated through policy and routing layers. This enables safer execution by enforcing boundaries on network access and credentials, while still allowing agents to use tools and models through a controlled interface.
Learn more about NemoClaw’s Architecture:
https://docs.nvidia.com/nemoclaw/latest/reference/architecture.html
Where FriendliAI Fits In
Under NemoClaw, inference requests from the agent never leave the sandbox directly. OpenShell intercepts them, manages credentials externally, and forwards requests to the configured upstream provider. This is where FriendliAI comes in. It is not a generic placeholder, but an inference layer purpose-built for demanding workloads: low latency, high throughput, and support for a broad catalog of open models, all through an OpenAI-compatible API that requires no changes to your agent code.
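Because the API is OpenAI-compatible, the agent side only ever produces standard chat-completions requests, regardless of which provider sits upstream. A minimal sketch of that request shape (the endpoint and model name below are illustrative, not taken from a real deployment):

```python
import json

# OpenClaw needs no FriendliAI-specific code: it speaks the standard
# OpenAI chat-completions format to a local endpoint, and OpenShell
# forwards the request upstream to the configured provider.
BASE_URL = "https://inference.local/v1"  # OpenShell-managed local endpoint

def build_chat_request(model: str, user_message: str) -> dict:
    """Build an OpenAI-compatible /chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": False,
    }

# Model ID is illustrative; any model in the FriendliAI catalog works.
body = build_chat_request("zai-org/GLM-5", "Summarize my unread messages.")
print(json.dumps(body, indent=2))
```

Swapping providers means changing only where OpenShell routes this request, never the request itself.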

This routing layer is the key integration point between FriendliAI and NemoClaw.
Inside a NemoClaw setup:
- The OpenClaw agent sends requests to a local endpoint such as `inference.local`, which is managed by OpenShell.
- OpenShell forwards these requests to the configured upstream provider, using credentials such as API keys that are securely stored and managed outside the agent runtime.
- The upstream provider processes the request and returns the response.
Configuring FriendliAI as the provider means you don't have to choose between security and performance. NemoClaw controls what the agent can access, while FriendliAI ensures efficient and reliable inference.
Integrating FriendliAI with NemoClaw
In this section, we will walk through configuring FriendliAI as the inference backend in a NemoClaw environment.
1. Create a FriendliAI API Key
First, create a FriendliAI account and generate an API key.
- Sign up at friendli.ai
- Navigate to the Friendli Suite dashboard.
- Generate a FRIENDLI_API_KEY and copy it.
You will use this token when registering FriendliAI as an inference provider.
2. Install NemoClaw
If you have not installed NemoClaw yet, run the following command:
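The distribution channel may vary by release; one plausible form, assuming NemoClaw ships as an npm package (the package name is illustrative):

```bash
npm install -g nemoclaw
```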
This installs NemoClaw along with its CLI tools.
After installation, you will be guided through the onboarding wizard.
During onboarding:
- You can either select a provider directly or follow the default setup
- You will be asked to define a sandbox name
3. Register Friendli as an Inference Provider
Now, configure OpenShell to use FriendliAI as the upstream inference provider.
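Subcommand and flag names below are illustrative; consult the OpenShell CLI reference for the exact syntax:

```bash
openshell inference set provider friendliai \
  --api-key "$FRIENDLI_API_KEY"
```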
Then set the model:
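For example (the model ID is illustrative; any model from the FriendliAI catalog works):

```bash
openshell inference set model zai-org/GLM-5
```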
You can verify the configuration with:
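One plausible form (the exact subcommand is illustrative):

```bash
openshell inference show
```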
At this point, OpenShell will route all inference requests to FriendliAI.
4. Verify the Configuration Inside the Sandbox
Now connect to your sandbox:
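Assuming the sandbox name chosen during onboarding was `my-agent` (both the subcommand and the name are illustrative):

```bash
openshell connect my-agent
```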
Once inside, check the OpenClaw configuration:
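For example (the file lives inside the sandbox; its exact path may differ in your setup):

```bash
cat openclaw.json
```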
You will see something like:
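A minimal illustrative snippet; the `model` field reflects whatever you configured in step 3:

```json
{
  "model": "zai-org/GLM-5",
  "baseUrl": "https://inference.local/v1"
}
```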
This is an important detail: the `baseUrl` is set to `https://inference.local/v1`, which is the OpenShell endpoint. OpenClaw sends requests to this local endpoint, and OpenShell then forwards them to FriendliAI.
This abstraction enables secure credential handling by keeping API keys outside the agent runtime, while routing all requests through a centralized layer.
5. Run OpenClaw
You are now ready to start your agent:
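Assuming the CLI entrypoint is simply `openclaw`:

```bash
openclaw
```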
This launches an interactive interface directly in your terminal, where you can run an agent enhanced with NemoClaw’s security, routing, and sandboxing capabilities.
Notes
The OpenClaw UI and the actual inference layer operate independently by design. The UI reflects the local configuration in `openclaw.json` inside the sandbox, so even if you change the provider or model using `openshell inference set`, the UI may still display the previous model name. OpenShell handles the actual routing behind the scenes. If you want the UI to reflect the current model, update the model entry in `openclaw.json` manually. Inference itself is always routed correctly regardless of what the UI shows.
Putting It All Together
NemoClaw solves a real problem: how to run autonomous agents in production without giving them unchecked access to your systems and credentials. FriendliAI fits precisely into that architecture as the inference layer, handling open model execution with the performance that agent workloads demand.
The integration is simple. Whether you're running a proof of concept today or preparing for a production deployment, the same configuration scales with you.
If you are interested in exploring FriendliAI further:
- Serverless models we support: https://friendli.ai/model?products=SERVERLESS
- Integration with OpenClaw: https://friendli.ai/blog/integrating-friendliai-with-openclaw
Written by
FriendliAI Tech & Research
General FAQ
What is FriendliAI?
FriendliAI is a GPU-inference platform that lets you deploy, scale, and monitor large language and multimodal models in production, without owning or managing GPU infrastructure. We offer three things for your AI models: unmatched speed, cost efficiency, and operational simplicity. Find out which product is the best fit for you here.
How does FriendliAI help my business?
Our Friendli Inference allows you to squeeze more tokens-per-second out of every GPU. Because you need fewer GPUs to serve the same load, the true metric—tokens per dollar—comes out higher even if the hourly GPU rate looks similar on paper. View pricing
Which models and modalities are supported?
Over 530,000 text, vision, audio, and multi-modal models are deployable out of the box. You can also upload custom models or LoRA adapters. Explore models
Can I deploy models from Hugging Face directly?
Yes. Selecting “Friendli Endpoints” on the Hugging Face Hub takes you to our model deployment page in one click. The page provides an easy-to-use interface for setting up Friendli Dedicated Endpoints, a managed service for generative AI inference. Learn more about our Hugging Face partnership
Still have questions?
If you want a customized solution for that key issue that is slowing your growth, email contact@friendli.ai or click Talk to an engineer; our engineers (not a bot) will reply within one business day.

