• November 1, 2022
  • 1 min read

Save on Training Costs of Generative AI with Friendli Training


Generative AI is already widely used for chatbots, translation, code generation, summarization, image generation, and much more. Thanks to recent advances, generative AI can now produce high-quality text and images. A report from Sequoia Capital says “Just as mobile unleashed new types of applications …, we expect these large models to motivate a new wave of generative AI applications.”¹ A notable example of generative AI is GPT-3², a pre-trained language model for diverse text generation tasks.

We recently had a chance to compare Friendli Training with Microsoft DeepSpeed on training a 16B GPT-3 model.

To train the model, we used 16 VMs, each of which hosts 8 NVIDIA A100 40GB GPUs. In total, we used 128 A100 GPUs and ran 150K steps to train the model.

Friendli Training speeds up training by 3.5x compared to Microsoft DeepSpeed on one of the top three public clouds, thanks to its engine-cloud co-optimization. It chooses the best time-cost tradeoff for your chosen model and cloud, supporting AWS, Azure, and GCP. Save time and cut costs significantly. Try out Friendli Training now to train your own generative AI models!
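To see how a speedup of this kind translates into cost savings, here is a rough back-of-the-envelope sketch in Python. The GPU hourly rate and baseline training time below are hypothetical placeholders for illustration, not figures from the comparison above; only the GPU count (128) and the 3.5x speedup come from the post.

```python
# Hypothetical cost comparison for a fixed training job.
# Assumed numbers (NOT from the benchmark): $3.00/GPU-hour and a
# baseline (DeepSpeed) wall-clock time of 700 hours.
NUM_GPUS = 128          # from the post: 16 VMs x 8 A100 GPUs
GPU_HOURLY_RATE = 3.00  # USD per GPU-hour (assumed)
BASELINE_HOURS = 700.0  # baseline wall-clock time (assumed)
SPEEDUP = 3.5           # Friendli Training speedup from the post

baseline_cost = NUM_GPUS * GPU_HOURLY_RATE * BASELINE_HOURS
friendli_hours = BASELINE_HOURS / SPEEDUP
friendli_cost = NUM_GPUS * GPU_HOURLY_RATE * friendli_hours
savings = baseline_cost - friendli_cost

print(f"Baseline cost: ${baseline_cost:,.0f}")
print(f"Friendli cost: ${friendli_cost:,.0f}")
print(f"Savings:       ${savings:,.0f} ({savings / baseline_cost:.0%})")
```

Since GPU cost scales linearly with wall-clock time at a fixed GPU count and rate, a 3.5x speedup cuts the bill by about 71% regardless of the exact hourly rate.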

[1] https://www.sequoiacap.com/article/generative-ai-a-creative-new-world/

[2] https://arxiv.org/abs/2005.14165


Written by

FriendliAI Tech & Research



General FAQ

What is FriendliAI?

FriendliAI is a GPU-inference platform that lets you deploy, scale, and monitor large language and multimodal models in production, without owning or managing GPU infrastructure. We offer three things for your AI models: unmatched speed, cost efficiency, and operational simplicity. Find out which product is the best fit for you here.

How does FriendliAI help my business?

Our Friendli Inference allows you to squeeze more tokens-per-second out of every GPU. Because you need fewer GPUs to serve the same load, the true metric—tokens per dollar—comes out higher even if the hourly GPU rate looks similar on paper. View pricing
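As a sketch of the tokens-per-dollar comparison described above, the following Python snippet uses made-up throughput and pricing numbers (not actual FriendliAI measurements or rates) to show how higher per-GPU throughput can win even at a slightly higher hourly rate.

```python
# Hypothetical tokens-per-dollar sketch. All throughput and rate
# values below are illustration-only assumptions.
def tokens_per_dollar(tokens_per_second: float, gpu_hourly_rate: float) -> float:
    """Tokens generated per USD of GPU time."""
    tokens_per_hour = tokens_per_second * 3600
    return tokens_per_hour / gpu_hourly_rate

# Baseline GPU: 1,000 tokens/s at $3.00/hour (assumed).
baseline = tokens_per_dollar(tokens_per_second=1000, gpu_hourly_rate=3.00)
# Optimized serving: 2,500 tokens/s at $3.20/hour (assumed).
optimized = tokens_per_dollar(tokens_per_second=2500, gpu_hourly_rate=3.20)

print(f"baseline:  {baseline:,.0f} tokens/$")   # 1,200,000 tokens/$
print(f"optimized: {optimized:,.0f} tokens/$")  # 2,812,500 tokens/$
```

Even though the optimized configuration's hourly rate is higher on paper, its tokens-per-dollar figure comes out more than twice as high, which is the point the FAQ answer makes.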

Which models and modalities are supported?

Over 380,000 text, vision, audio, and multi-modal models are deployable out of the box. You can also upload custom models or LoRA adapters. Explore models

Can I deploy models from Hugging Face directly?

Yes. Selecting “Friendli Endpoints” on the Hugging Face Hub gives you a one-click deploy that takes you to our model deployment page. The page provides an easy-to-use interface for setting up Friendli Dedicated Endpoints, a managed service for generative AI inference. Learn more about our Hugging Face partnership

Still have questions?

If you want a customized solution for the key issue that is slowing your growth, email contact@friendli.ai or click Contact Sales; our experts (not a bot) will reply within one business day.