Function Calling
by Kacper Schnetzer, AI Engineer & Data Scientist
Function Calling: What is it and how to use it?
Function Calling is a technology that enables intelligent language models to execute application/system functions based on the analysis of provided context and data. This allows for building more interactive, dynamic, and practical applications. By integrating with functions created by developers, models can solve specific problems rather than just generate text responses. An example of a model supporting Function Calling technology is Llama-3.3-70B-Instruct, which is available in the Sherlock Cloudferro service.
Do all language models allow the use of Function Calling?
Function Calling is currently becoming a standard in new language models. However, it's important to note that not all models support this technology yet.
Application example
A user of a chat on a pizzeria's website wants to order a Margherita pizza. They express their desire in the chat, providing their first name, last name, and the name of the pizza. A traditional chat system can only respond to the user in text form ("Unfortunately, I cannot place this order for you, I am here only to help you choose a pizza."). With Function Calling, the chat system can place an order for the user before providing a response (assuming that the order placement functionality is available to the chat system and it has all the required data). Only after that will it respond in text form - "Your order has been placed."
Why should we use Function Calling?
In a traditional approach, language models are limited to responding with text. However, many applications require more complex actions, such as:
- Extracting key information from a database and processing it (e.g. displaying a list of orders placed by a user in an online store);
- Responding to user needs by calling appropriate application functions (e.g., placing an order in an online store when the user expresses such a desire in the chat);
- Communicating with external APIs when needed (e.g. retrieving weather data in response to a user's question "What will be the weather tomorrow in Kraków?").
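For instance, the weather question in the last point could be served by exposing a small application function to the model, such as the hypothetical one below (get_weather and its endpoint are illustrative, not a real API):

import requests

def get_weather(city: str, date: str) -> dict:
    # Hypothetical weather-service endpoint; any provider would do
    response = requests.get(
        "https://weather.example.com/forecast",
        params={"city": city, "date": date},
        timeout=10,
    )
    return response.json()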
How to use Function Calling in your own applications?
Integrating Function Calling with an existing application is straightforward. The functionalities that the developer wants to expose for potential use by a language model (for example, one handling a chat) must be described in an appropriate JSON structure, and the communication process with the model must be extended with handling of application function calls. To better understand the integration of Function Calling with an existing system, we will use the example of an online pizza ordering system.
Let's imagine a pizzeria's website that allows users to order pizza via chat, handled by a language model. Thanks to Function Calling, the language model can:
- Understand user requests such as "I would like to order two Pepperoni pizzas and one Margherita."
- Execute the functions responsible for placing the order and registering it in the system.
- Display a list of a user's current orders upon request by retrieving it from the database.
Function definitions
The order placement application is built around two functions. One places an order (place_order), and the other retrieves the list of orders placed by a specific user (get_orders).
def place_order(user, products):
    # Create the order record and persist it in the database
    order = {
        "user": user,
        "products": products,
        "status": "accepted"
    }
    database["orders"].append(order)
    return {
        "message": f"Order has been placed: {products}",
        "order_id": len(database["orders"]) - 1
    }

def get_orders(user):
    # Collect all orders belonging to the given user
    user_orders = [o for o in database["orders"] if o["user"] == user]
    if not user_orders:
        return {"message": "You don't have any orders."}
    return {
        "orders": [
            {
                "order_id": idx,
                "products": o["products"],
                "status": o["status"]
            }
            for idx, o in enumerate(user_orders)
        ]
    }
Database
The application uses a database, which in this case is imitated by the following object:
database = {
    "orders": [
        {"user": "john_doe", "products": ["Margherita"], "status": "accepted"},
        {"user": "jane_smith", "products": ["Pepperoni", "Napoletana"], "status": "accepted"}
    ]
}
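To sanity-check these functions before connecting them to a model, they can be called directly; given the two seeded orders above, a newly placed order receives the ID 2:

print(place_order("Jan Nowak", ["Margherita"]))
# {'message': "Order has been placed: ['Margherita']", 'order_id': 2}
print(get_orders("jane_smith"))
# {'orders': [{'order_id': 0, 'products': ['Pepperoni', 'Napoletana'], 'status': 'accepted'}]}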
Function definitions for the model
An important step, as mentioned earlier, is to describe the functions made available to the model in an appropriate JSON structure. This allows the language model to understand which functions are available in our system and what parameters they require, and lets it decide when to use them.
tools = [
    {
        "type": "function",
        "function": {
            "name": "place_order",
            "description": "Places a pizza order.",
            "parameters": {
                "type": "object",
                "properties": {
                    "user": {
                        "type": "string",
                        "description": "First and last name of the user placing the order.",
                    },
                    "products": {
                        "type": "array",
                        "description": "Names of ordered pizzas.",
                        "items": {"type": "string"},
                    },
                },
                "required": ["user", "products"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_orders",
            "description": "Returns a list of placed orders.",
            "parameters": {
                "type": "object",
                "properties": {
                    "user": {
                        "type": "string",
                        "description": "First and last name of the user.",
                    }
                },
                "required": ["user"],
            },
        },
    },
]
Communication with the model
The next stage is handling communication with the language model. In response to a user's query, the model can return a standard text response, but it can also request the execution of one or more functions instead. In the latter case, instead of immediately returning a response to the user, the functions selected by the model should be executed first, and their return values should be added to the chat history. Only then is the final model response generated, based on the chat history enriched with the function results. The entire process is broken down into more detailed fragments below.

Let's start by creating a simple script that connects to the model API using the openai library. This enables communication not only with the API provided by OpenAI but also with APIs provided by other suppliers, including Cloudferro in the Sherlock service.
import openai
client = openai.OpenAI(
    base_url="https://api-sherlock.cloudferro.com/openai/v1",
    api_key="XUznNjfktdvQpebkfnzLmvdaEpBQJ",
)
Next, for the sake of example, let's create a list of messages in the chat history, which will contain only one message where the user expresses a desire to place a pizza order.
chat_history = [
    {"role": "user", "content": "I would like to place an order for a Margherita pizza under the name Jan Nowak."}
]
Now let's generate a response to this chat history. Since the user is explicitly asking to place an order in the system, the response will not be text but rather a function call instruction.
completion = client.chat.completions.create(
    model="Llama-3.3-70B-Instruct",
    messages=chat_history,
    tools=tools,
)
As a response, we receive an object containing the following particularly important data:
- completion.choices[0].message.content - the text of the model's response (an empty string if the model requests a function call);
- completion.choices[0].message.tool_calls - a list of functions to execute (empty in the case of a standard text response).
The response (completion) also contains extra information that goes beyond the scope of this article. A full description of the object returned by client.chat.completions.create can be found in the openai library documentation.
Typically, the model's response should be placed in the chat history.
chat_history.append(completion.choices[0].message)
In our case, when the user asks to place a pizza order, the model should of course use the place_order function. To verify that this is indeed happening, you should check whether completion contains the list completion.choices[0].message.tool_calls. If tool_calls is None, it means the model did not request the execution of any functions. Otherwise, we get a list of functions to call, which in our example looks as follows:
[
    ChatCompletionMessageToolCall(
        id="RpElTEjLK",
        function=Function(
            arguments='{"products": ["Margheritta"], "user": "Jan Nowak"}',
            name="place_order",
        ),
        type=None,
        index=0,
    )
]
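In code, this check can be reduced to a simple branch (a minimal sketch):

message = completion.choices[0].message
if message.tool_calls:
    # the model requested one or more function calls - handle them first
    ...
else:
    # a standard text response - it can be shown to the user directly
    print(message.content)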
The next step is to call the functions from the list above. In a complete system this should happen automatically; implementing it is the responsibility of the developer of the system in which the Function Calling functionality is embedded. However, for the purposes of the example, we will skip this stage, as it is related not to Function Calling itself but to the logic of the system's operation. Therefore, we proceed directly to calling the place_order function, which is the only one that appeared in the model's response.
import json

# Extract the requested call and parse its JSON-encoded arguments
tool_call = completion.choices[0].message.tool_calls[0]
arguments = json.loads(tool_call.function.arguments)

# Execute the function and convert its result to a string
function_call_result = str(
    place_order(user=arguments["user"], products=arguments["products"]),
)
Remember that the value returned from the function should be converted to a string, as it will be reintroduced into the chat history, which is a list of text data. The value returned from the place_order function:
{'message': "Order has been placed: ['Margheritta']", 'order_id': 2}
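A common variation is to serialize the result with json.dumps instead of str(), which produces valid JSON rather than a Python repr; either form works here, since the model only needs a textual representation:

function_call_result = json.dumps(
    place_order(user=arguments["user"], products=arguments["products"])
)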
The result of the function call is placed in the chat history. Note that the role field should contain the word tool, not assistant, which is used only for textual responses from the model. Additionally, the tool_call_id parameter should be included, which is a reference to the appropriate function call instruction from completion.choices[0].message.tool_calls.
chat_history.append(
    {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": function_call_result,
    }
)
Finally, after enriching the chat history with the function call request and the values returned by the functions, we generate another model response.
completion = client.chat.completions.create(
    model="Llama-3.3-70B-Instruct",
    messages=chat_history,
    tools=tools,
)
The returned response:
The Margheritta pizza order has been placed under the name Jan Nowak. The order ID is 2.
In this simple example, the model requested the execution of only one function. However, there may be a situation where the list of functions to execute is longer. In that case, it is necessary to call each function and place as many elements in the chat history as there were functions to call.
Example:
tool_calls = completion.choices[0].message.tool_calls
...
# here call each function
...
function_call_results = [...]  # list of values returned by each function

for value, tool_call in zip(function_call_results, tool_calls):
    chat_history.append(
        {
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": str(value),
        }
    )
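The elided dispatch step can be implemented, for example, with a simple registry mapping function names to the Python functions defined earlier (a minimal sketch; the registry itself is an assumption, not part of the API):

available_functions = {
    "place_order": place_order,
    "get_orders": get_orders,
}

function_call_results = []
for tool_call in tool_calls:
    # look up the requested function and call it with the parsed arguments
    func = available_functions[tool_call.function.name]
    arguments = json.loads(tool_call.function.arguments)
    function_call_results.append(func(**arguments))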
It is possible for a parameter of one function to be a value returned by another function. Let's assume there is a function that allows converting a user ID to their first and last name.
id_to_name = {
    "0": "Jan Nowak"
}

def get_name_from_id(user_id):
    return id_to_name[user_id]
In the pizzeria example, if the user did not provide their first and last name but instead their identifier, the model should first instruct the execution of one function - get_name_from_id. Only in the next step should it ask the system to execute the place_order function. Whether this actually happens depends on the quality of the model used and its reasoning abilities, so it is important to safeguard against incorrect parameters in function calls. In case of an error during a function call, a good practice is to return to the model a clear description of why the function was not executed correctly.
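One way to implement this safeguard is to wrap each dispatch in a try/except block and hand the error text back to the model as the tool result (a sketch; the exact wording of the message is up to the developer):

try:
    result = func(**arguments)
except Exception as exc:
    # returned to the model as the tool output, so it can react or ask the user
    result = {"error": f"Call to {tool_call.function.name} failed: {exc}"}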
Streaming
The openai library allows for communication with the language model API in two modes:
- standard mode (stream = False) - the API response contains all the generated text;
- streaming mode (stream = True) - the API response is streamed in segments (one application of this mode is displaying the model's response in the chat dynamically, in real time, token by token, as soon as tokens are generated).
When the response is streamed, a Function Calling request is likewise divided into segments delivered in real time during response generation.
Example: Sherlock API
chat_history = [
    {
        "role": "user",
        "content": "I would like to place an order for a Margherita pizza under the name Jan Nowak.",
    }
]

stream = client.chat.completions.create(
    model="Llama-3.3-70B-Instruct", messages=chat_history, tools=tools, stream=True
)

for chunk in stream:
    delta = chunk.choices[0].delta
    print(delta, end="\n\n")
Program output:
ChoiceDelta(content='', function_call=None, refusal=None, role='assistant', tool_calls=None)
ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id='chatcmpl-tool-2e7340ad8926437199ebcfd64bcd8090', function=ChoiceDeltaToolCallFunction(arguments=None, name='place_order'), type='function')])
ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"user": "', name=None), type=None)])
ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='Jan', name=None), type=None)])
ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments=' Now', name=None), type=None)])
ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='ak"', name=None), type=None)])
ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments=', "products": ["', name=None), type=None)])
ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='Marg', name=None), type=None)])
ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='her', name=None), type=None)])
ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='itta"]}', name=None), type=None)])
ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='', name=None), type=None)])
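To act on such a streamed request, the argument fragments have to be reassembled first. Below is a minimal sketch (replacing the printing loop above, since a stream can be consumed only once, and assuming fragments arrive in order with the index field shown in the output):

tool_calls = []
for chunk in stream:
    delta = chunk.choices[0].delta
    if not delta.tool_calls:
        continue
    for fragment in delta.tool_calls:
        if fragment.id:
            # the first fragment of a call carries its id and function name
            tool_calls.append(
                {"id": fragment.id, "name": fragment.function.name, "arguments": ""}
            )
        if fragment.function and fragment.function.arguments:
            # subsequent fragments extend the JSON arguments string
            tool_calls[fragment.index]["arguments"] += fragment.function.arguments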
Summary
Function Calling is becoming a standard in modern large language models (LLMs). As the examples above show, it makes it possible to achieve an effect that was previously much more complicated to obtain and more error-prone. Modern LLMs are highly effective at the Function Calling task because they have been trained for exactly this purpose on appropriate datasets.
Function Calling expands the possibilities of integrating language models with existing systems and enables the use of these systems in the most natural way - through the language humans communicate with. Thanks to these tools, applications will become even easier to use: language models will act as intelligent intermediaries between the user and the system. This mechanism can significantly simplify performing tasks in various types of applications - from pizza ordering systems to handling official matters online.
Explore Sherlock platform
Discover a fully managed Generative AI platform with OpenAI-compatible endpoints, enabling seamless integration of advanced AI capabilities into your applications. Access high-performing language models through a unified API, eliminating the complexity of infrastructure management and model operations.