Three ways to run generative AI models with Friendli Engine:
Friendli Dedicated Endpoints
Build and serve custom LLMs on autopilot with Friendli Dedicated Endpoints
Reducing LLM serving costs for a novel writing service
Friendli Container helped NaCloud reduce the cost of serving LLMs.
Problem
Operating a writing service powered by LLMs
Solution
Uses Friendli Container for LLM serving
Result
Cuts LLM serving cost instantly
Upstage LLMs with Friendli Dedicated Endpoints
Upstage’s Solar LLMs are operated cost-efficiently without any operation burden, thanks to Friendli Dedicated Endpoints.
Problem
Operating LLMs cost-efficiently under varying input traffic
Solution
Uses Friendli Dedicated Endpoints for running LLMs
Result
Cost-efficient LLM offering without any operational burden
Chatbot Company A
LLM-powered chatbot company A cuts GPU costs by more than 50% instantly.
Problem
Processing ~0.5 trillion tokens per month incurs high H100 GPU costs
Solution
Uses Friendli Container for LLM serving
Result
Cuts costs by more than 50% instantly
Integration of Friendli Engine with Amazon SageMaker JumpStart
Amazon SageMaker JumpStart users can run VARCO LLMs with Friendli Engine.
Problem
Serving JumpStart Foundation Models incurs performance and cost challenges
Solution
Friendli Engine has been integrated with Amazon SageMaker JumpStart to serve JumpStart Foundation Models
Result
Harness the power of Friendli Engine to serve JumpStart Foundation Models
TUNiB’s emotional chatbots with Friendli Dedicated Endpoints
TUNiB’s chatbot services operate smoothly with Friendli Dedicated Endpoints.
Problem
Managing chatbot LLMs incurs significant engineering effort
Solution
Uses Friendli Dedicated Endpoints for the models
Result
Convenient, reliable, and cost-efficient service without the need for self-management
Zeta blooms with Friendli Container
Scatter Lab’s chatbot serves its users with Friendli Engine.
Problem
The generative model is expensive to run
Solution
Uses Friendli Container for Zeta 2.0
Result
Cuts costs by 50%