

Introduction

This guide walks through the steps to serve Mixture of Experts (MoE) models, such as Mixtral 8x7B, using Friendli Container.

Search for the optimal policy and run Friendli Container

To serve MoE models efficiently, you first need to run a policy search to find the optimal execution policy; see Running Policy Search for details. The search compiles the best policy it finds into a policy file. When you launch the engine with that file, it serves the endpoint using the optimal policy.
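As a rough sketch, a launch with policy search enabled might look like the following. The image name, environment variable (`FRIENDLI_CONTAINER_SECRET`), and flags (`--hf-model-name`, `--algo-policy-dir`, `--search-policy`) reflect typical Friendli Container usage but are assumptions here; confirm the exact invocation against the Running Policy Search page.

```shell
# Illustrative sketch only -- verify the image name, env vars, and flags
# against the Running Policy Search documentation before running.

# Host directory where the compiled policy file is written and reused.
export POLICY_DIR=$PWD/policy
mkdir -p "$POLICY_DIR"

docker run --gpus all \
  -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -v "$POLICY_DIR":/policy \
  -e FRIENDLI_CONTAINER_SECRET="$FRIENDLI_CONTAINER_SECRET" \
  registry.friendli.ai/trial \
  --hf-model-name mistralai/Mixtral-8x7B-Instruct-v0.1 \
  --algo-policy-dir /policy \
  --search-policy true
```

On later launches with the same `/policy` mount, the engine can pick up the previously compiled policy file instead of re-running the search.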
Last modified on April 13, 2026