- July 18, 2024
- 4 min read
Friendli Tools Part 1: Function Calling—Connecting LLMs with Functions and APIs

Friendli Tools Series: Part 1 of 3
Function calling, first introduced by OpenAI, enables language models to interact with existing software infrastructure via function interfaces. In this first post of our blog series on the new Friendli Tools release, we cover the basics of function calling for language models.
What are LLM Function Calls?
Have you ever wished to make a large language model (LLM) do something more specific, like summarizing relevant information from a Google search? One easy solution is to have the model use an existing API or a particular function of your choice. LLMs leverage their ability to interpret human language to call appropriate modular functions that retrieve information or take a specific action. Based on the context of its conversation with the user, an LLM can decide when to make a function call, which function to call, and which arguments to pass.
According to the research Gorilla: Large Language Model Connected with Massive APIs, this advancement can be as significant as “transforming LLMs into the primary interface to computing infrastructure and the web”. When empowered by function calls, LLMs can be the brains of agentic systems. Imagine booking your entire vacation by talking to an agent that accesses hotel, flight, weather, and entertainment web APIs. Alternatively, consider meal prepping for the week by talking to an agent that accesses calendar, refrigerator, and shopping web APIs. You could even write a scholarly book and market it with the help of an agent that accesses academic databases, email, and social media APIs.
The Basic Process of Function Calls
Let’s start by taking a look at this chat completion request.
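A minimal sketch of such a request, assuming an OpenAI-compatible Python client; the base URL, API token, and model name below are illustrative placeholders, not exact values:

```python
from openai import OpenAI

# The base URL, token, and model name here are illustrative assumptions.
client = OpenAI(
    base_url="https://inference.friendli.ai/v1",
    api_key="YOUR_FRIENDLI_TOKEN",
)

messages = [
    {"role": "user", "content": "What is the weather like in Seoul right now?"}
]

response = client.chat.completions.create(
    model="meta-llama-3-70b-instruct",
    messages=messages,
)
```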
An accurate response to this question would need information on the current weather in Seoul, which is unavailable to a pretrained LLM. This is where function calling comes into play.
Step 1. Get functions to add as tools
We can use a function like the ‘get_current_weather’ function below to obtain the current temperature at a given location. The example function is hardcoded for explanatory purposes; a real tool would call an actual weather API.
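A minimal sketch of such a hardcoded function, returning the 10 degrees Celsius figure used later in this example:

```python
import json


def get_current_weather(location: str, unit: str = "celsius") -> str:
    """Return the current temperature at a location.

    Hardcoded for explanatory purposes; a real tool would call a weather API.
    """
    if "seoul" in location.lower():
        return json.dumps({"location": location, "temperature": 10, "unit": unit})
    return json.dumps({"location": location, "temperature": "unknown", "unit": unit})
```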
Step 2. Convert functions into JSON schemas
To incorporate functions as LLM tools, you need to provide an array of your functions in JSON format. Some libraries, such as Pydantic, help convert coded functions into JSON schemas. An example of the ‘get_current_weather’ function in JSON format is:
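A sketch of what this schema could look like, following the widely used OpenAI tool format; the description strings are illustrative:

```json
{
  "type": "function",
  "function": {
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
      "type": "object",
      "properties": {
        "location": {
          "type": "string",
          "description": "The city and country, e.g. Seoul, South Korea"
        },
        "unit": {
          "type": "string",
          "enum": ["celsius", "fahrenheit"],
          "description": "The temperature unit to use"
        }
      },
      "required": ["location"]
    }
  }
}
```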
For function calling, the descriptions of a function and its arguments are the context the LLM receives in its prompt, which it uses to infer when and how to call the function. Using these descriptions, the LLM identifies the most suitable tool among those available; it never needs to see the actual code of each function. Accurately describing functions and their parameters is therefore an essential part of providing functions as tools.
Step 3. Structured output containing function calls
LLMs were originally designed to generate text for humans. However, some models have been trained to output structured data, such as JSON, which can be easily interpreted by programming languages. Similarly, models trained for function calling are generally expected to respond following a JSON schema. The JSON output contains the tool chosen to be called, along with its arguments. An example of a response in JSON format is:
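Assuming the earlier request is re-sent with the schema above passed as the `tools` array, the model's reply could look like the following; the call id is an illustrative placeholder:

```json
{
  "role": "assistant",
  "content": null,
  "tool_calls": [
    {
      "id": "call_abc123",
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "arguments": "{\"location\": \"Seoul, South Korea\", \"unit\": \"celsius\"}"
      }
    }
  ]
}
```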
Step 4. Invoking the function
The function itself has to be invoked by separate code outside of the LLM. Thanks to the structured output, this is easy to code. Once the function result is given back to the model, it generates a response that incorporates the result.
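Continuing the sketch above, and assuming the `client`, `messages`, `response`, and `tools` objects from the earlier snippets, one way to invoke the function and hand its result back to the model:

```python
import json

# Extract the tool call from the model's structured output.
tool_call = response.choices[0].message.tool_calls[0]
arguments = json.loads(tool_call.function.arguments)

# Invoke the actual Python function with the model-supplied arguments.
result = get_current_weather(**arguments)

# Append the assistant's tool call and the function result, then ask the
# model for a final answer that incorporates the result.
messages.append(response.choices[0].message)
messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})

final_response = client.chat.completions.create(
    model="meta-llama-3-70b-instruct",  # illustrative model name
    messages=messages,
    tools=tools,  # the array containing the JSON schema from Step 2
)
```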
Although it is tricky to strictly enforce an output format on LLMs, as they base their inference on the probabilities of next tokens, Friendli Inference, which powers all Friendli products (including Friendli Container, Dedicated Endpoints, and Serverless Endpoints), is engineered to ensure structured outputs with a strong guarantee. If you want to read further on our structured outputs, our blog post Introducing Structured Output on Friendli Inference for Building LLM Agents is a great place to start.
Step 5. The final generated response
Through function calling, the chat model acquired the current temperature in Seoul: 10 degrees Celsius. Without access to the ‘get_current_weather’ tool, the model would likely have hallucinated incorrect weather information in its response.
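In the running sketch, reading the final message might look like this; the printed text is an illustrative example of the model's answer:

```python
print(final_response.choices[0].message.content)
# e.g. "The current weather in Seoul is 10 degrees Celsius."
```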
Full Python Script
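A self-contained sketch assembling the steps above; the base URL, token, and model name remain illustrative assumptions:

```python
import json

from openai import OpenAI

# The base URL, token, and model name are illustrative assumptions.
client = OpenAI(
    base_url="https://inference.friendli.ai/v1",
    api_key="YOUR_FRIENDLI_TOKEN",
)
MODEL = "meta-llama-3-70b-instruct"


def get_current_weather(location: str, unit: str = "celsius") -> str:
    """Hardcoded weather lookup; a real tool would call a weather API."""
    if "seoul" in location.lower():
        return json.dumps({"location": location, "temperature": 10, "unit": unit})
    return json.dumps({"location": location, "temperature": "unknown", "unit": unit})


tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and country, e.g. Seoul, South Korea",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]

messages = [
    {"role": "user", "content": "What is the weather like in Seoul right now?"}
]

# Step 3: the model decides whether to call a tool and with which arguments.
response = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
message = response.choices[0].message

if message.tool_calls:
    # Step 4: invoke the function outside the LLM with the model's arguments.
    tool_call = message.tool_calls[0]
    arguments = json.loads(tool_call.function.arguments)
    result = get_current_weather(**arguments)

    messages.append(message)
    messages.append(
        {"role": "tool", "tool_call_id": tool_call.id, "content": result}
    )

    # Step 5: the model incorporates the function result into its final answer.
    final_response = client.chat.completions.create(
        model=MODEL, messages=messages, tools=tools
    )
    print(final_response.choices[0].message.content)
else:
    # The model answered directly without calling a tool.
    print(message.content)
```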
Conclusion
In conclusion, function calling allows LLMs to recognize when a task requires an external API or specific code execution, and to intelligently construct a function call with the right arguments to collect the results. This enhances the AI's ability to perform complex tasks and interact with external systems, transforming LLMs into LLM agents.
As we look ahead, we're excited to announce the upcoming release of Friendli Tools, designed to simplify and optimize function calling. Stay tuned for parts 2 and 3 of this blog series, which will provide more updates and detailed guides on leveraging these new tools to unlock the full potential of your AI solutions.
Written by
FriendliAI Tech & Research
General FAQ
What is FriendliAI?
FriendliAI is a GPU-inference platform that lets you deploy, scale, and monitor large language and multimodal models in production, without owning or managing GPU infrastructure. We offer three things for your AI models: unmatched speed, cost efficiency, and operational simplicity. Find out which product is the best fit for you here.
How does FriendliAI help my business?
Our Friendli Inference allows you to squeeze more tokens-per-second out of every GPU. Because you need fewer GPUs to serve the same load, the true metric—tokens per dollar—comes out higher even if the hourly GPU rate looks similar on paper. View pricing
Which models and modalities are supported?
Over 380,000 text, vision, audio, and multi-modal models are deployable out of the box. You can also upload custom models or LoRA adapters. Explore models
Can I deploy models from Hugging Face directly?
Yes. Selecting “Friendli Endpoints” on the Hugging Face Hub gives you one-click deployment, taking you to our model deployment page. The page provides an easy-to-use interface for setting up Friendli Dedicated Endpoints, a managed service for generative AI inference. Learn more about our Hugging Face partnership
Still have questions?
If you want a customized solution for the key issue that is slowing your growth, email contact@friendli.ai or click Contact Sales; our experts (not a bot) will reply within one business day.