Customer Stories

FriendliAI brings frontier AI to production — powering systems that handle billions of interactions, real-time intelligence, and massive-scale inference.

Talk to an engineer

LG AI Research deploys K-EXAONE and triples traffic

LG AI Research used Friendli Dedicated and Serverless Endpoints to take its 32B EXAONE and 236B K-EXAONE models from research to production — cutting deployment time by several weeks and tripling traffic.

Dedicated Endpoints · Serverless Endpoints
Read Case Study

SK Telecom achieves 5× LLM throughput and 3× cost savings

South Korea's largest telecom operator needed to serve enterprise AI agents for millions of users under strict SLAs. Powered by FriendliAI, SKT went from evaluation to full production in hours, cutting operational costs threefold and boosting throughput fivefold without adding infrastructure complexity.

Dedicated Endpoints
Read Case Study

NextDay AI cuts GPU costs by 50% and boosts throughput 2–3×

NextDay AI needed better inference economics without changing hardware. By deploying Friendli Container with optimized 8-bit quantization and iteration batching, NextDay AI delivered more capacity from the same infrastructure.

Friendli Container
Read Case Study

What Our Customers Are Saying

Our custom model API went live in about a day with enterprise-grade monitoring built in.

Rock-solid reliability with ultra-low tail latency.

Scale to trillions of tokens with 50% fewer GPUs, thanks to FriendliAI.

Fluctuating traffic is no longer a concern because autoscaling just works.

Friendli Engine is an irreplaceable solution for generative AI serving.

Explore FriendliAI today