Function calling in the chat completions API allows you to define specific functions that the LLM can identify and suggest calling when appropriate. When the model determines a function should be called, it outputs a formatted JSON object containing the function name and parameters, which the application can then execute to perform an action (such as data retrieval).
In this topic, we will look at function calling with OpenAI's chat completions API.
What are function calls?
Programming functions can be compared to interface controls: they receive input and perform specific actions in response. In the context of chat completions API, function calls represent structured ways for the model to indicate when specific programmatic actions should be taken.
Rather than directly executing code, the model generates structured JSON output describing the intended function call. This output can then be processed by the application to execute the actual function, allowing controlled interaction between the LLM and external systems.
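For example, when asked about an appointment, the model might not answer directly but instead return a description of the call it wants made, roughly of this shape (a simplified illustration using a function we will define shortly; the full response objects appear later in this topic):

{
    "name": "get_appointment_status",
    "arguments": "{\"patient_id\": \"67890\"}"
}

The application parses this JSON, runs the matching function, and feeds the result back to the model.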
Function calls in chat completions enable:
Access to external databases and APIs for real-time information
Execution of programmatic operations like scheduling
Handling of user input with structured responses and actions
Setting up a function
Suppose you have written a function for a hospital management program that, when given a patient's ID, returns the current status of their appointment:
appointment_status = {
    '12345': 'Confirmed',
    '67890': 'Pending',
    '54321': 'Cancelled',
    '98765': 'Completed'
}

def get_appointment_status(patient_id):
    # Look up the patient ID; fall back to a default message if it is unknown
    status = appointment_status.get(patient_id, 'No Appointment Found')
    return status

This integration enables the model to request get_appointment_status() with the appropriate patient ID when queried about appointments, instead of inventing an answer. Functions are defined through a list of JSON objects. Here is the structure:
functions_list = [
    {
        "type": "function",
        "function": {
            "name": "get_appointment_status",
            "description": "Get the appointment status of a patient",
            "parameters": {
                "type": "object",
                "properties": {
                    "patient_id": {"type": "string", "description": "The patient ID"}
                },
                "required": ["patient_id"],
            },
        },
    }
]

Here are the properties in the JSON object to notice:
"name"indicates the calling name of the function, and"description"has the details about when to call that function.Inside
"parameters"we list all the function parameters inside"properties"in this format:"properties": { "parameter_name_1":{ "type": (data type of the parameter 1), "description": (Identifying details of the parameter 1)}, "parameter_name_2":{ "type": (data type of the parameter 2), "description": (Identifying details of the parameter 2)}, ... (continue for all parameters) }"required"is the list of all required parameters. Here is the structure:"required": [ "parameter_name_1", ... (list all required parameters) ]
This JSON object specifies the function's intent and requirements. The latest models are trained to detect when such a function should be triggered based on the user's input. This allows for a more intuitive interaction with the API, as the model can infer the need for specific functions without explicit instruction.
Calling a function
Let's create the get_chat_completion() function. We have to include the tools and tool_choice parameters to enable function calling:
tools will hold the list of functions.
tool_choice will be set to "auto". This enables the LLM to automatically decide which function(s) to call from the list based on the context.
import json
import os

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

def get_chat_completion(messages, tools=None, tool_choice="auto"):
    return client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        tools=tools,
        tool_choice=tool_choice,
    )

Let's ask the model about the appointment status without providing the patient ID. It will respond with some clarifying questions.
messages = [
    {
        "role": "system",
        "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.",
    },
    {"role": "user", "content": "What's the appointment status?"},
]

chat_response = get_chat_completion(messages, tools=functions_list)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)
print(assistant_message)

The output:
ChatCompletionMessage(
    content="I need more information to provide the appointment status. Could you please provide the patient ID for whom you would like to know the appointment status?",
    refusal=None,
    role="assistant",
    audio=None,
    function_call=None,
    tool_calls=None,
)

So, the LLM asks for the patient ID before it can respond to the user's message. Also, tool_calls=None indicates that no function call is needed to assist GPT. Let's provide the patient ID in our next prompt:
messages.append({"role": "user", "content": "The patient id is 67890."})
chat_response = get_chat_completion(messages, tools=functions_list)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)
print(assistant_message)The output
ChatCompletionMessage(
    content=None,
    refusal=None,
    role="assistant",
    audio=None,
    function_call=None,
    tool_calls=[
        ChatCompletionMessageToolCall(
            id="call_<id>",
            function=Function(
                arguments='{"patient_id":"67890"}', name="get_appointment_status"
            ),
            type="function",
        )
    ],
)

This time, content=None indicates that GPT has not generated any response for the user. Instead, it provides a tool_calls object with the necessary information to call a function. We have to call the function accordingly to assist GPT. Let's look at what is inside tool_calls:
id identifies this particular tool call. Later, we will use it to pass back the result after calling the function.
arguments is a JSON string containing the function parameter name "patient_id" as the key and "67890" as its value.
name is a string indicating the function to call.
Now, let's run the function using the tool_calls details:
# First, create a dictionary that maps the function name string to the actual function
available_functions = {
    "get_appointment_status": get_appointment_status
}

# Extract the tool_calls content
tool_call = assistant_message.tool_calls[0]

# Extract the function name and the arguments
function_name = tool_call.function.name
function_args = json.loads(tool_call.function.arguments)

# Get the actual function
function_to_call = available_functions[function_name]

# Call the function with the parameter(s) and store the function response
function_response = function_to_call(patient_id=function_args.get("patient_id"))

Now, we have to send the function response back to GPT so that it can generate an appropriate response for the particular patient:
messages.append({"tool_call_id": tool_call.id,
"role": "tool",
"name": function_name,
"content": function_response,
})
chat_response = get_chat_completion(messages, tools=functions_list)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)
print(assistant_message)Finally, GPT generates the expected response for the patient
ChatCompletionMessage(
    content="The appointment status for patient with ID 67890 is pending.",
    refusal=None,
    role="assistant",
    audio=None,
    function_call=None,
    tool_calls=None,
)

Preparing for parallel function calls
Now, let's look at parallel function calls. Imagine you want to fetch weather data and currency exchange rates simultaneously for a travel dashboard. Parallel function calling lets the model request multiple function calls in a single response, so your application can execute them in one go.
The tool_choice="auto" mode enables models to infer which specific function to use based on the conversation context. Let's illustrate this by extending our previous example. We want to create another function that outputs the scheduled time if the patient's appointment status is confirmed. This new function requires the patient ID and the last name of the patient.
appointment_time = {
    '12345': {'time': "10:30 AM, 29th July",
              'l_name': "Smith"}
}

def get_appointment_time(patient_id, last_name):
    # Check if the patient ID is in the dictionary
    patient_info = appointment_time.get(patient_id)
    if patient_info:
        # Check if the last name matches the one stored in the dictionary
        if patient_info['l_name'].lower() == last_name.lower():
            return patient_info['time']
        else:
            return "The last name did not match our records."
    else:
        return "Your appointment has not been confirmed yet. Please wait for appointment confirmation."

Here is the updated functions list in JSON format:
functions_list = [
    {
        "type": "function",
        "function": {
            "name": "get_appointment_status",
            "description": "Get the appointment status of a patient",
            "parameters": {
                "type": "object",
                "properties": {
                    "patient_id": {"type": "string", "description": "The patient ID"}
                },
                "required": ["patient_id"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_appointment_time",
            "description": "Get the appointment time of a patient.",
            "parameters": {
                "type": "object",
                "properties": {
                    "patient_id": {"type": "string", "description": "The patient ID"},
                    "last_name": {
                        "type": "string",
                        "description": "The last name of the patient.",
                    },
                },
                "required": ["patient_id", "last_name"],
            },
        },
    },
]

Here we have added another JSON object to functions_list following the format described earlier.
Calling parallel functions
Everything is set for parallel function calling:
messages = [{"role": "system",
"content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."
},
{"role": "user",
"content": "What's the appointment status and time?"
}]
chat_response = get_chat_completion(messages, tools=functions_list)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)
print(assistant_message)The output
ChatCompletionMessage(
    content="Sure, I will need the patient ID and last name in order to retrieve the appointment status and time. Could you please provide me with the necessary information?",
    refusal=None,
    role="assistant",
    audio=None,
    function_call=None,
    tool_calls=None,
)

Let's provide the necessary information:
messages.append({"role": "user", "content": "The patient id is 12345 and the last name is smith."})
chat_response = get_chat_completion(messages, tools=functions_list)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)
print(assistant_message)The output
ChatCompletionMessage(
    content=None,
    refusal=None,
    role="assistant",
    audio=None,
    function_call=None,
    tool_calls=[
        ChatCompletionMessageToolCall(
            id="call_<id>",
            function=Function(
                arguments='{"patient_id": "12345"}', name="get_appointment_status"
            ),
            type="function",
        ),
        ChatCompletionMessageToolCall(
            id="call_<id>",
            function=Function(
                arguments='{"patient_id": "12345", "last_name": "smith"}',
                name="get_appointment_time",
            ),
            type="function",
        ),
    ],
)

Now, look at the tool_calls list. It contains multiple function calls, each with its own id and arguments. We have to call each function and send the output back to the LLM using the respective id. We can do it this way:
available_functions = {
    "get_appointment_status": get_appointment_status,
    "get_appointment_time": get_appointment_time
}

# If tool_calls is not empty
if assistant_message.tool_calls:
    # For each tool call, run the corresponding function and append its result to the messages
    for tool_call in assistant_message.tool_calls:
        # Extract the function name and the arguments
        function_name = tool_call.function.name
        function_args = json.loads(tool_call.function.arguments)

        # Get the actual function
        function_to_call = available_functions[function_name]

        # Call the function with the parameter(s) and store the function response
        if function_name == "get_appointment_status":
            function_response = function_to_call(patient_id=function_args.get("patient_id"))
        elif function_name == "get_appointment_time":
            function_response = function_to_call(patient_id=function_args.get("patient_id"),
                                                 last_name=function_args.get("last_name"))

        # Update the messages list with the corresponding tool_call_id
        messages.append({
            "tool_call_id": tool_call.id,
            "role": "tool",
            "name": function_name,
            "content": function_response,
        })

# Get the final response
chat_response = get_chat_completion(messages, tools=functions_list)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)
print(assistant_message)

Finally, we obtain the desired output from GPT:
ChatCompletionMessage(
    content='The appointment status for patient ID 12345 is "Confirmed" and the appointment time is scheduled for 10:30 AM on 29th July.',
    refusal=None,
    role="assistant",
    audio=None,
    function_call=None,
    tool_calls=None,
)

Try the above steps again for a different patient ID.
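If you would rather not repeat these steps by hand, the whole request → tool call → tool result → answer cycle can be wrapped in a small helper. Below is a minimal sketch built on the pieces defined above; the helper name run_conversation is our own, and it assumes every entry in available_functions accepts the keyword arguments the model supplies:

def run_conversation(messages, tools, available_functions):
    # Ask the model; it may answer directly or request tool calls
    response = get_chat_completion(messages, tools=tools)
    assistant_message = response.choices[0].message
    messages.append(assistant_message)

    # No tool calls requested: the assistant's text is the final answer
    if not assistant_message.tool_calls:
        return assistant_message.content

    # Run every requested function and report each result back by its tool_call_id
    for tool_call in assistant_message.tool_calls:
        function_to_call = available_functions[tool_call.function.name]
        function_args = json.loads(tool_call.function.arguments)
        function_response = function_to_call(**function_args)
        messages.append({
            "tool_call_id": tool_call.id,
            "role": "tool",
            "name": tool_call.function.name,
            "content": function_response,
        })

    # Let the model turn the tool results into a user-facing reply
    final_response = get_chat_completion(messages, tools=tools)
    return final_response.choices[0].message.content

With this in place, a single call like run_conversation(messages, functions_list, available_functions) handles both the single and the parallel tool-call cases.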
One last thing: sometimes we may need to force GPT to use a particular function from the list of functions. We can do that by setting tool_choice to an object that names the function, like this:

chat_response = get_chat_completion(
    messages,
    tools=functions_list,
    tool_choice={"type": "function", "function": {"name": "get_appointment_time"}},
)

Conclusion
As a result, you are now familiar with the following:
Function calling with the chat completions API allows models to identify when specific functions should be called and to output structured JSON, enabling controlled interaction between the LLM and external systems.
Functions are defined through JSON objects that specify the function's name, description, parameters, and required parameters, which helps the model understand when and how to trigger appropriate functions.
When a function call is triggered, the model provides an object containing the function name and arguments, which can then be executed to perform the actual function and return results back to the model for response generation.
The API supports parallel function calling, allowing the model to request multiple function calls in a single response so your application can execute them together.