Goals

  • Use tool calling to build your own AI agent with Friendli Serverless Endpoints
  • Check out the examples below to see how you can interact with state-of-the-art language models while letting them search the web, run Python code, etc.
  • Feel free to make your own custom tools!

Getting Started

  1. Head to https://friendli.ai, and create an account.
  2. Grab a Friendli Token to use Friendli Serverless Endpoints within an agent.
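
The code examples below read this token from the FRIENDLI_TOKEN environment variable (or a YOUR_FRIENDLI_TOKEN placeholder), so set it in your shell before running them.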

🚀 Step 1. Playground UI

Experience tool calling on the Playground!


  1. In the left sidebar, click the ‘Serverless Endpoints’ option to access the playground page.
  2. You will see the models available as Serverless Endpoints. Choose the one you want and select its endpoint.
  3. Click the ‘Tools’ button, select the Search tool, and enter a query to see the response. 😀

🚀 Step 2. Tool Calling

Search for interesting information using the web:search tool. This time, let’s try it by writing Python code.

  1. Add the user’s input as a user role message.
  2. Add the web:search tool to the tools option.
# pip install friendli

import os
from friendli import SyncFriendli

with SyncFriendli(
    token=os.getenv("FRIENDLI_TOKEN", ""),
) as friendli:
    res = friendli.serverless.tool_assisted_chat.complete(
        model="meta-llama-3.1-8b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Find information on the popular movies currently showing in theaters and provide their ratings.",
            },
        ],
        tools=[{"type": "web:search"}],
        max_tokens=200,
    )

    print(res)
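
`print(res)` dumps the entire response object. To print only the assistant’s message text, you can swap the final print for the following (this assumes the same `choices[0].message` response shape used by the Chat API examples in Step 4):

    print(res.choices[0].message.content)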

🚀 Step 3. Multiple tool calling

Use multiple tools at once to work out “how long it will take you to buy a house in the San Francisco Bay Area based on your annual salary”. Here are the available built-in tools (a combined example follows the list).

  • math:calculator (tool for calculating arithmetic operations)
  • math:statistics (tool for analyzing statistic data)
  • math:calendar (tool for handling date-related data)
  • web:search (tool for retrieving data through the web search)
  • web:url (tool for extracting data from a given website)
  • code:python-interpreter (tool for writing and executing python code)
  • file:text (tool for extracting text data from a given file)
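
As a minimal sketch, the Step 2 snippet extends naturally to multiple tools: pass several entries in the tools option and let the endpoint decide which ones to call (the prompt and max_tokens value below are illustrative):

# pip install friendli

import os
from friendli import SyncFriendli

with SyncFriendli(
    token=os.getenv("FRIENDLI_TOKEN", ""),
) as friendli:
    res = friendli.serverless.tool_assisted_chat.complete(
        model="meta-llama-3.1-8b-instruct",
        messages=[
            {
                "role": "user",
                "content": "My annual salary is $100k. How long will it take to buy a house in the San Francisco Bay Area?",
            },
        ],
        # Multiple built-in tools; the model picks which ones to call.
        tools=[
            {"type": "web:search"},
            {"type": "math:calculator"},
        ],
        max_tokens=400,
    )

    print(res.choices[0].message.content)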

Example Answer sheet

Prompt: My annual salary is $100k. How long will it take to buy a house in the San Francisco Bay Area? (`web:search` & `math:calculator` used)

Answer: Based on the web search results, the median price of an existing single-family home in the Bay Area is around $1.25 million.
Using a calculator to work out how long it would take to buy a house in the San Francisco Bay Area with an annual salary of $100,000, we get:
$1,250,000 (house price) / $100,000 (annual salary) = 12.5 years
So, it would take approximately 12.5 years to buy a house in the San Francisco Bay Area with an annual salary of $100,000,
assuming you save your entire salary each year and don’t consider other factors like interest rates, taxes, and living expenses.
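
As a quick sanity check of the arithmetic in that answer (plain Python, using the figures from the example):

house_price = 1_250_000    # median single-family home price from the web search
annual_salary = 100_000
print(house_price / annual_salary)  # 12.5 (years, saving the entire salary)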

🚀 Step 4. Build a custom tool

Build your own creative tool. We will show you how to make a custom tool that retrieves temperature information. (The complete code snippet is provided at the bottom.)

  1. Define a function to use as a custom tool
def get_temperature(location: str) -> int:
    """Mock function that returns the city temperature"""
    if "new york" in location.lower():
        return 45
    if "san francisco" in location.lower():
        return 72
    return 30
  2. Send a function calling inference request

    1. Add the user’s input as a user role message.
    2. The information about the custom function (e.g., get_temperature) goes into the tools option. The function’s parameters are described in JSON schema.
    3. The response includes an arguments field, which contains the values extracted from the user’s input to be used as parameters of the custom function.
    # pip install friendli
    
    import os
    from friendli import SyncFriendli
    
    token = os.environ.get("FRIENDLI_TOKEN") or "YOUR_FRIENDLI_TOKEN"
    client = SyncFriendli(token=token)
    user_prompt = "I live in New York. What should I wear for today's weather?"
    
    messages = [
        {
            "role": "user",
            "content": user_prompt,
        },
    ]
    
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_temperature",
                "description": "Get the temperature information in a given location.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The name of current location e.g., New York",
                        },
                    },
                },
            },
        },
    ]
    
    chat = client.serverless.chat.complete(
        model="meta-llama-3.3-70b-instruct",
        messages=messages,
        tools=tools,
        temperature=0,
        frequency_penalty=1,
    )
    
    print(chat)
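
    The printed response should include a tool_calls entry rather than a final answer. To peek at just the extracted pieces (field access mirrors the parsing code in the next step; the exact argument string the model returns may vary):

    print(chat.choices[0].message.tool_calls[0].function.name)       # e.g. "get_temperature"
    print(chat.choices[0].message.tool_calls[0].function.arguments)  # e.g. '{"location": "New York"}'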
    
  3. Generate the final response using the tool calling results

    1. Add the tool_calls response as an assistant role message.
    2. Add the result obtained by calling the get_temperature function as a tool message to the Chat API again.
    import json
    
    # Parse the arguments the model extracted from the user's input
    # and call the actual Python function with them.
    func_kwargs = json.loads(chat.choices[0].message.tool_calls[0].function.arguments)
    temperature_info = get_temperature(**func_kwargs)
    
    messages.append(
        {
            "role": "assistant",
            "tool_calls": [
                tool_call.model_dump()
                for tool_call in chat.choices[0].message.tool_calls
            ]
        }
    )
    messages.append(
        {
            "role": "tool",
            "content": str(temperature_info),
            "tool_call_id": chat.choices[0].message.tool_calls[0].id
        }
    )
    
    chat_w_info = client.serverless.chat.complete(
        model="meta-llama-3.3-70b-instruct",
        tools=tools,
        messages=messages,
    )
    
    for choice in chat_w_info.choices:
        print(choice.message.content)
    
  • Complete Code Snippet

    # pip install friendli
    
    import json
    import os
    from friendli import SyncFriendli
    
    token = os.environ.get("FRIENDLI_TOKEN") or "YOUR_FRIENDLI_TOKEN"
    client = SyncFriendli(token=token)
    user_prompt = "I live in New York. What should I wear for today's weather?"
    
    def get_temperature(location: str) -> int:
        """Mock function that returns the city temperature"""
        if "new york" in location.lower():
            return 45
        if "san francisco" in location.lower():
            return 72
        return 30
    
    messages = [
        {
            "role": "user",
            "content": user_prompt,
        },
    ]
    
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_temperature",
                "description": "Get the temperature information in a given location.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The name of the current location, e.g., New York",
                        },
                    },
                },
            },
        },
    ]
    
    # First call: the model decides to call get_temperature and extracts its arguments.
    chat = client.serverless.chat.complete(
        model="meta-llama-3.3-70b-instruct",
        messages=messages,
        tools=tools,
        temperature=0,
        frequency_penalty=1,
    )
    
    func_kwargs = json.loads(chat.choices[0].message.tool_calls[0].function.arguments)
    temperature_info = get_temperature(**func_kwargs)
    
    # Feed the tool call and its result back to the model for the final answer.
    messages.append(
        {
            "role": "assistant",
            "tool_calls": [
                tool_call.model_dump()
                for tool_call in chat.choices[0].message.tool_calls
            ]
        }
    )
    messages.append(
        {
            "role": "tool",
            "content": str(temperature_info),
            "tool_call_id": chat.choices[0].message.tool_calls[0].id
        }
    )
    
    chat_w_info = client.serverless.chat.complete(
        model="meta-llama-3.3-70b-instruct",
        tools=tools,
        messages=messages,
    )
    
    for choice in chat_w_info.choices:
        print(choice.message.content)
    

🎉 Congratulations!

By following the steps above, you’ve walked through the whole process of defining and using a custom tool to get an accurate and rich answer out of an LLM!

Brainstorm creative ideas for your agent by reading our blog articles!