Customer Stories
FriendliAI brings frontier AI to production — powering systems that handle billions of interactions, real-time intelligence, and massive-scale inference.
LG AI Research deploys K-EXAONE with 3× traffic growth
LG AI Research used Friendli Dedicated and Serverless Endpoints to take their 32B EXAONE and 236B K-EXAONE models from research to production — cutting deployment time by several weeks and tripling traffic.


SK Telecom achieves 5× LLM throughput and 3× cost savings
South Korea's largest telecom operator needed to serve enterprise AI agents for millions of users under strict SLAs. Powered by FriendliAI, SKT went from evaluation to full production in hours, cutting operational costs by 3× and boosting throughput 5× without adding infrastructure complexity.
NextDay AI cuts GPU costs by 50% and boosts throughput 2–3×
NextDay AI needed better inference economics without changing hardware. By deploying Friendli Container with optimized 8-bit quantization and iteration batching, NextDay AI delivered more capacity from the same infrastructure.

What Our Customers Are Saying
Our custom model API went live in about a day with enterprise-grade monitoring built in.
Rock-solid reliability with ultra-low tail latency.
Scale to trillions of tokens with 50% fewer GPUs, thanks to FriendliAI.
Fluctuating traffic is no longer a concern because autoscaling just works.
Friendli Engine is an irreplaceable solution for generative AI serving.