Blog

Accelerating LLM Training with Memory-Balanced Pipeline Parallelism
We are delighted to announce that FriendliAI’s work has been accepted and selected for an oral presentation at ICML ’23!...

PeriFlow’s Enriched Coverage for Sought-After LLMs: MPT, LLaMA, and Dolly
We have some exciting news to share! As you probably know, FriendliAI’s PeriFlow supports various LLMs, including GPT and T5. We further...

Get an Extra Speedup of LLM Inference with Integer Quantization on PeriFlow
At FriendliAI, our top priority is to deliver a serving system with the best performance. We are excited to introduce...

Fine-tuning and Serving CodeGen, a Code Generation Model, with PeriFlow
CodeGen, unveiled in 2022 by Salesforce, is a language model that allows users to create programs with natural language instead...

Save on Training Costs of Generative AI with PeriFlow
Generative AI is already widely used for chatbots, translation, code generation, summarization, image generation, and much more. Thanks to recent...

Serve generative AI models like T5 faster than ever with PeriFlow (32.8x faster for T5-3B)
In our previous blog articles (#1, #2), we showed the performance gain of PeriFlow (aka Orca) on GPT-3, a popular...

PeriFlow: How Good is it on Small Models?
We showed the dramatic performance gain (cost saving) of PeriFlow (aka Orca) running large-scale generative models like GPT 175B, thanks...

PeriFlow: How to Serve Large-scale Transformer Models
Transformer models have recently been transforming the landscape in deep learning, particularly in natural language processing, thanks to their excellence...

Introducing GPT-FAI 13B: A Large-scale Language Model Trained with FriendliAI’s PeriFlow
We are happy to announce that we are releasing GPT-FAI 13B, a large-scale language model trained with FriendliAI’s PeriFlow. GPT-FAI...