- August 19, 2024
- 6 min read
Hassle-free LLM Fine-tuning with FriendliAI and Weights & Biases
In this article, we will demonstrate how to fine-tune generative AI models using Friendli Dedicated Endpoints together with Weights & Biases. Friendli Dedicated Endpoints, part of the Friendli Suite platform for serving custom large language models (LLMs), provides streamlined fine-tuning and deployment of generative AI models.
Weights & Biases (W&B) is a popular AI developer platform that helps developers train and fine-tune models and manage them from experimentation to production throughout the LLM application development process. W&B automatically tracks and versions experiments, making it easy for ML teams to discover and reproduce their experiments and pipelines.
Benefits of Using W&B with Friendli Dedicated Endpoints
Integrating W&B for fine-tuning on Friendli Dedicated Endpoints offers a smooth and impactful user experience. This integration allows teams to effortlessly monitor and collaborate on their fine-tuning activities. All it takes is sharing the W&B API key and project name to get started.
In this article, we will explain how to create and monitor fine-tuning jobs on Friendli Dedicated Endpoints with Weights & Biases. Follow along with this how-to guide to learn how to:
- Upload and manage your datasets efficiently on Friendli Dedicated Endpoints.
- Initiate and monitor fine-tuning jobs with W&B.
- Launch and monitor multiple jobs and share their reports.
By the end of this guide, you will understand how to effectively fine-tune your LLMs using Friendli Dedicated Endpoints and W&B.
Simplifying Fine-tuning for Generative AI
In the realm of generative AI, a "one-size-fits-all" approach rarely suffices. Fine-tuning models using enterprise-specific data is a critical step for AI application development, yet the process can often be cumbersome.
Developers repeatedly face time-consuming chores: setting up GPUs, uploading datasets, adjusting hyperparameters, tracking progress, and evaluating models. These tasks detract from the primary goal of fine-tuning, making the process more complex than necessary.
This is where the integration of Friendli Dedicated Endpoints with W&B comes into play. Fine-tuning on Friendli Dedicated Endpoints streamlines setup and management, allowing developers to concentrate on enhancing model performance. Consequently, businesses can achieve tailored solutions that provide a competitive edge in the market.
How to upload your model and dataset
To access your W&B model artifacts via Friendli Dedicated Endpoints, configure your W&B API key in your personal settings in the Friendli Suite. For detailed instructions on uploading your model as a W&B model artifact, check out our previous blog post on the W&B and FriendliAI integration.
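If you have not logged your model to W&B yet, a minimal sketch of uploading a model directory as a W&B model artifact looks something like the following. The project and artifact names are placeholders, and the exact model files Friendli expects are covered in the linked post.

```python
import wandb

# Placeholder project and artifact names; adjust to your own setup.
run = wandb.init(project="friendli-finetuning", job_type="upload-model")

# Package a local model directory (weights, tokenizer, config) as a model artifact.
artifact = wandb.Artifact(name="my-base-model", type="model")
artifact.add_dir("path/to/model")

run.log_artifact(artifact)
run.finish()
```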
Navigate to the ‘Datasets’ section within your dedicated endpoints project page to upload your fine-tuning dataset. Enter the dataset name, then either drag and drop your .jsonl training and validation files or browse for them on your computer. If your files meet the required criteria, the blue 'Upload' button will be activated, allowing you to complete the process.
You can access our example dataset ‘FriendliAI/gsm8k’ on Hugging Face and explore some of our quantized generative AI models on our Hugging Face page.
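Each line of a .jsonl file is a single JSON training example. As a minimal sketch of preparing such a file, assuming a chat-style ‘messages’ schema (the schema and sample content below are illustrative; see the gsm8k example dataset for the exact format):

```python
import json

# Illustrative chat-style records; check the example dataset for the exact schema.
examples = [
    {
        "messages": [
            {"role": "user", "content": "A robe takes 2 bolts of blue fiber and half that much white fiber. How many bolts in total does it take?"},
            {"role": "assistant", "content": "It takes 2 / 2 = 1 bolt of white fiber, so 2 + 1 = 3 bolts in total. #### 3"},
        ]
    },
]

# Write one JSON object per line to produce a .jsonl training file.
with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```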
How to create your fine-tuning job
This section demonstrates the process of creating a fine-tuning job in your dedicated endpoints project. After selecting a project, navigate to the fine-tuning page to see an overview of all your fine-tuning jobs. Press the blue 'New Job' icon at the far right to create a new job.
Then, a page will appear where you can configure your new fine-tuning experiment. You’ll need to enter the following information:
- Job name
- Model
- Dataset
- W&B project name
- Hyperparameters
Below is an example of a newly configured fine-tuning job named ‘W&B test’. Select ‘Weights & Biases’ as the base model. If you haven’t integrated your W&B API key with your Friendli Suite account yet, you’ll be asked to enter it here. Provide the full name of the W&B model artifact and verify that your integrated W&B account has access to the selected model.
Next, choose the dataset that you have previously uploaded to the endpoint project.
Then, enter a W&B project name to monitor your fine-tuning job. If you provide a project name that already exists, your job will be added to that project. Otherwise, a new W&B project will be automatically created in your integrated W&B account.
Lastly, enter the training hyperparameters for fine-tuning your model. Since we support LoRA fine-tuning, you can also configure the related parameters. Once all the values are entered, click the blue 'Create' button to proceed.
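For reference, a LoRA fine-tuning configuration might look roughly like the sketch below. The values and parameter names are illustrative only; use the fields exposed in the job creation form.

```python
# Illustrative values only; the actual fields and defaults are shown in the
# Friendli Dedicated Endpoints job creation form.
hyperparameters = {
    "learning_rate": 1e-4,
    "batch_size": 16,
    "num_epochs": 3,
    # LoRA-specific settings
    "lora_rank": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
}
```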
Hurray! The ‘W&B test’ fine-tuning job has been launched! After initialization finishes, the job transitions to the training state, and within a few minutes you can monitor its progress as shown below.
You can now switch over to the W&B project site to see important metrics like training loss, mean token accuracy, and more. Details on monitoring your fine-tuning job will be covered in the next section.
How to monitor your fine-tuning job
This section explains how to monitor your fine-tuning job on the W&B platform. In addition, we will discuss the W&B report feature, a helpful tool that allows developers to share insights on the training process.
First, log in to your integrated W&B account and open the relevant project. In our case, that is ‘W&B project’, the project name we assigned when launching the fine-tuning job.
You can then view a panel with real-time graph visualizations on key training metrics. Some useful metrics you can monitor include gradient norm, learning rate, and training loss.
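Besides the web UI, you can also pull these metrics programmatically with the W&B public API. Here is a small sketch; the entity/project path and metric keys such as "train/loss" are assumptions, so inspect the run's history to see what is actually logged for your job.

```python
import wandb

api = wandb.Api()

# Placeholder entity/project path; use the W&B project you entered when creating the job.
for run in api.runs("my-entity/my-wandb-project"):
    history = run.history()  # logged metrics as a pandas DataFrame
    print(run.name, list(history.columns))
    # "train/loss" is an assumed metric key; pick one from the printed columns.
    if "train/loss" in history.columns:
        print(history["train/loss"].dropna().tail())
```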
W&B enables you to write and share reports on training progress directly on the platform. We created an example ‘Llama 3 8B Fine-tuning Report’ that highlights a seemingly alarming accuracy drop early in the fine-tuning process.
W&B can alert you when training accuracies fall below a specified threshold, minimizing the need to constantly monitor training metrics or worry about crashes. If you decide to terminate the fine-tuning job early, you can do so from our Friendli Dedicated Endpoints fine-tuning page by pressing the ‘Cancel’ icon.
How to duplicate your fine-tuning job
Last but not least, if you want to rerun the training with a different configuration, you can duplicate the job to create a new fine-tuning experiment by pressing the ‘Duplicate’ icon.
When launching multiple fine-tuning jobs in a single W&B project, you can view and compare all the graphs together to identify the best fine-tuning configuration or compare different models. For instance, observe how the duplicated job, which uses a different initial learning rate and batch size, is yielding faster training results below!
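The same comparison can be scripted with the W&B public API, for example to pull each run's configuration and final summary metrics side by side. The project path and metric key below are placeholders.

```python
import wandb

api = wandb.Api()

# Placeholder project path and metric key; adjust to your own project.
for run in api.runs("my-entity/my-wandb-project"):
    config = {k: v for k, v in run.config.items() if not k.startswith("_")}
    final_loss = run.summary.get("train/loss")
    print(f"{run.name}: final train/loss={final_loss}, config={config}")
```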
Deploying the Fine-tuned Model
The steps for deploying the fine-tuned model are the same as for deploying any custom model on Friendli Dedicated Endpoints. For further information, please refer to our documentation on launching a model, or our blog post on directly deploying a W&B model artifact on Friendli Dedicated Endpoints.
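Once the endpoint is running, querying it typically looks like a standard chat completions call. The sketch below assumes an OpenAI-compatible API; the exact base URL and whether the model field takes your endpoint ID are described in the Friendli documentation.

```python
from openai import OpenAI

# Assumed base URL and placeholder endpoint ID; confirm both in the Friendli documentation.
client = OpenAI(
    api_key="YOUR_FRIENDLI_TOKEN",
    base_url="https://api.friendli.ai/dedicated/v1",
)

response = client.chat.completions.create(
    model="YOUR_ENDPOINT_ID",
    messages=[{"role": "user", "content": "Natalia sold clips to 48 of her friends in April, and half as many in May. How many clips did she sell in total?"}],
)
print(response.choices[0].message.content)
```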
Conclusion
Fine-tuning LLMs with FriendliAI and Weights & Biases is a user-friendly way to customize your AI models. By integrating Friendli Dedicated Endpoints with W&B, you can easily manage and monitor your fine-tuning workflows.
This guide has shown you how to upload datasets, set up fine-tuning jobs, and monitor progress. The combination of Friendli Dedicated Endpoints and W&B's tracking tools ensures you achieve optimal results with minimal effort. At FriendliAI, we are motivated to provide reliable fine-tuning services, supporting your AI development with innovative solutions.
Written by
FriendliAI Tech & Research