- January 22, 2025
- 3 min read
Deploy Models from Hugging Face to Friendli Endpoints
In this blog, we announce our new strategic partnership with Hugging Face, the leading platform for hosting and collaborating on AI models, datasets, and applications. FriendliAI’s cutting-edge inference infrastructure is now integrated into the Hugging Face Hub, simplifying and accelerating inference serving for generative AI models.
The new integration introduces Friendli Endpoints as a deployment option within the Hugging Face Hub, offering developers direct access to high-performance, cost-effective inference infrastructure. By combining Hugging Face’s user-friendly platform with FriendliAI’s GPU-optimized inference technology, we’re empowering developers to unlock the full potential of generative AI while minimizing operational complexity and cost.
Simplifying Model Deployment on Hugging Face Hub
Last year, we introduced an integration with Hugging Face that lets users deploy Hugging Face models directly within the Friendli Suite platform. Through this integration, users have gained access to thousands of supported open-source models on Hugging Face, as well as the ability to deploy private models effortlessly. Building on this success, we are taking the integration further by enabling one-click deployment directly within the Hugging Face Hub.
Selecting “Friendli Endpoints” on the Hugging Face Hub takes you to our model deployment page. The page provides an easy-to-use interface for setting up Friendli Dedicated Endpoints, a managed service for generative AI inference. Additionally, while your deployment is in progress, you can chat with open-source models directly on the page to explore and test their capabilities before going to production. Click this link to head to the Hugging Face site and deploy the Llama 3.3 70B Instruct model directly.
Alternatively, you can deploy dedicated endpoints for private Hugging Face models by searching for the models on the FriendliAI platform.
Deploy models with NVIDIA H100 in Friendli Dedicated Endpoints
With our advanced GPU-optimized inference engine, Friendli Dedicated Endpoints delivers fast and cost-effective inference as a managed service. Whether you are serving an open-source model or a custom or private Hugging Face model, a single click of “Deploy now” on the model deployment page deploys it on NVIDIA H100 GPUs, hardware that is powerful but expensive to operate at scale. Friendli Dedicated Endpoints deliver:
- Fast and Cost-Effective Inference: FriendliAI’s optimization reduces the number of GPUs required while maintaining peak performance, significantly lowering costs.
- Simplified Infrastructure Management: Focus on innovation while FriendliAI handles the complexities of scaling and managing infrastructure.
Run Inference on Open-Source Models with Friendli Serverless Endpoints
For developers looking to run open-source models efficiently, Friendli Serverless Endpoints are an ideal solution. They offer:
- User-Friendly APIs: Simplify interactions with open-source models through APIs optimized by FriendliAI.
- High Performance at Low Cost: Ensure efficient inference with minimal expense.
- Interactive Experience: Chat directly with powerful open-source models on the model deployment page, making it easy to explore and test their capabilities.
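As a rough sketch of what calling a Serverless Endpoint looks like, the snippet below builds an OpenAI-style chat completions request with only the Python standard library. The base URL, model identifier, and `FRIENDLI_TOKEN` environment variable are assumptions for illustration; consult the Friendli documentation for the exact values for your endpoint.

```python
import json
import os
import urllib.request

# Assumed endpoint for Friendli Serverless Endpoints (OpenAI-compatible
# chat completions API) -- verify against the Friendli docs.
API_URL = "https://api.friendli.ai/serverless/v1/chat/completions"


def build_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }


# Hypothetical model identifier shown for illustration only.
payload = build_request("meta-llama-3.3-70b-instruct", "Say hello in one sentence.")

token = os.environ.get("FRIENDLI_TOKEN")  # personal access token, if set
if token:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        # Print the assistant's reply from the first choice.
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the API follows the familiar chat completions shape, existing OpenAI-compatible client libraries can typically be pointed at the endpoint by swapping the base URL and token.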
Driving the Future of AI Together
Our partnership marks a significant milestone in our mission to make AI more accessible, efficient, and impactful. By integrating FriendliAI’s advanced inference technology into the Hugging Face Hub, we are simplifying the deployment process and enabling developers to focus on what matters most: innovation.
We are thrilled to deepen our collaboration with Hugging Face and look forward to empowering the global AI community with tools that drive groundbreaking advancements. Together, we are shaping the next era of AI development. Read more about our partnership on the Hugging Face blog and start deploying models today on the Hugging Face Hub.
Written by
FriendliAI Tech & Research