
Introduction to the Assistants API

The Assistants API by OpenAI is an interface designed for building, customizing, and deploying AI-powered assistants within applications tailored to specific tasks and user needs. These AI assistants leverage advanced language models to interact naturally with users, provide information, automate workflows, and integrate with other applications.

This API is part of OpenAI’s broader platform, enabling seamless interaction with AI models while offering tools to manage assistant configurations, conversations, and user experiences.

Core Design

The Assistants API is OpenAI's solution for building AI assistants within your own applications. It lets developers combine OpenAI's models, tools, and file capabilities so assistants can respond intelligently to user queries.

The built-in tools currently include Code Interpreter, File Search, and Function Calling, and functionality can be extended through custom integrations. This lets you modularize an assistant's capabilities through tool registration and execution.

Unlike the Chat Completions API, which is stateless and requires you to manage conversation history yourself, the Assistants API maintains state through Threads, making it easier to build conversational experiences. This stateful nature simplifies many aspects of AI application development, particularly for applications requiring persistent context.
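To see the difference, here is a minimal sketch of what managing history yourself with the Chat Completions API looks like (it assumes a configured `client`, as set up in the next section):

# With Chat Completions, the caller owns the conversation state
history = [{"role": "user", "content": "What are the major planets?"}]
response = client.chat.completions.create(model="gpt-4o", messages=history)

# Every turn must be appended manually and the full list resent on the next call
history.append({"role": "assistant", "content": response.choices[0].message.content})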

NOTE: The Assistants API is currently in beta. OpenAI plans to deprecate it in the first half of 2026 as its functionality transitions to the Responses API.

Key Components of the Assistants API

The Assistants API consists of three primary components:

  1. Assistants: Encapsulate a base model, instructions, tools, and context documents

  2. Threads: Represent the state of a conversation

  3. Runs: Power the execution of an Assistant on a Thread, including text responses and tool use

If you use a custom hosting URL, configure the client with your API key and base URL:

from openai import OpenAI

# Initialize the client with your API key and custom base URL
client = OpenAI(
    api_key="your_api_key_here",
    base_url="<https://your-custom-endpoint.com/v1>"
)

# Alternatively, you can set environment variables
# export OPENAI_API_KEY=your_api_key_here
# export OPENAI_BASE_URL=https://your-custom-endpoint.com/v1

Basic Usage: Creating Your First Assistant

Let's create a simple assistant that can answer general questions. As you go through the code, you can execute each part separately to better understand what it does.

Step 1: Create an Assistant

# Create a new assistant
assistant = client.beta.assistants.create(
    name="General Knowledge Assistant",
    instructions="You are a helpful assistant that answers general knowledge questions.",
    model="gpt-4o",
)

print(f"Assistant created with ID: {assistant.id}")

# Output
# Assistant created with ID: asst_bvy9eF6ZQ6InsJ557Ey1FwXT
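Assistants persist on the platform, so you would typically create one once and reuse its ID rather than creating a new assistant per session. A minimal sketch, assuming you stored the ID from the output above:

# Fetch an existing assistant by ID instead of creating a new one each run
assistant = client.beta.assistants.retrieve("asst_bvy9eF6ZQ6InsJ557Ey1FwXT")

# You can also update its instructions or model in place
assistant = client.beta.assistants.update(
    assistant.id,
    instructions="You are a concise assistant for general knowledge questions."
)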

Step 2: Create a Thread

Threads store the message history between users and assistants, allowing for coherent multi-turn dialogues and context-aware responses. Each thread maintains the conversation state.

# Create a new thread
thread = client.beta.threads.create()
print(f"Thread created with ID: {thread.id}")

# Output:
# Thread created with ID: thread_fc8NwNLIiotKiyjuiZ1XMxbk

Step 3: Add a Message to the Thread

# Add a message to the thread
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What are the major planets in our solar system?"
)

print(f"Message added to thread: {message.id}")

# Output:
# Message added to thread: msg_p7x07h7mZRQO0k8RHpUusbd2

Step 4: Run the Assistant on the Thread

# Run the assistant on the thread
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id
)

print(f"Run created with ID: {run.id}")

# Output:
# Run created with ID: run_qQlyfeLg4xS7ubShr0NQNlGb

Step 5: Check Run Status

While waiting for the API call to finish, you can poll the run's `status` attribute to track its progress.

import time

# Wait for the run to complete
while run.status == "queued" or run.status == "in_progress":
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(
        thread_id=thread.id,
        run_id=run.id
    )
    print(f"Run status: {run.status}")

Step 6: Retrieve Response

# Retrieve the messages from the thread
messages = client.beta.threads.messages.list(
    thread_id=thread.id
)

# Display the assistant's response
for message in messages.data:
    if message.role == "assistant":
        print(f"Assistant: {message.content[0].text.value}")

Expected Output:

Assistant: The major planets in our solar system, in order from the Sun, are:

1. **Mercury** - The closest planet to the Sun and the smallest in the solar system.
2. **Venus** - Similar in size to Earth but with a thick atmosphere and high surface temperatures.
3. **Earth** - The only planet known to support life, with a diverse environment and liquid water.
4. **Mars** - Known as the Red Planet, it has a thin atmosphere and surface features similar to both the Moon and Earth.
5. **Jupiter** - The largest planet in the solar system, a gas giant with a prominent storm called the Great Red Spot.
6. **Saturn** - Famous for its stunning ring system and also a gas giant, second in size to Jupiter.
7. **Uranus** - An ice giant with a unique tilted axis that causes extreme seasonal changes.
8. **Neptune** - The farthest planet from the Sun, known for its deep blue color and strong winds.

These planets are divided into two main categories: terrestrial planets (Mercury, Venus, Earth, Mars) and gas giants/ice giants (Jupiter, Saturn, Uranus, Neptune).
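Note that `messages.list` returns messages newest first by default, which is why the assistant's reply comes up first when iterating. If you want to replay the conversation chronologically, you can request ascending order instead:

# List the thread oldest-first instead of the default newest-first
ordered = client.beta.threads.messages.list(thread_id=thread.id, order="asc")
for message in ordered.data:
    print(f"{message.role}: {message.content[0].text.value}")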

Integrating Tools with Assistants

One of the most important features of the Assistants API is the ability to use tools. The API currently supports three types of tools: Code Interpreter, File Search, and Function Calling.

Let's create an assistant that can write and execute Python code to solve problems. Make sure the imports are in place and the client is declared as shown above:

# Create an assistant with code interpreter capability
code_assistant = client.beta.assistants.create(
    name="Python Coding Assistant",
    instructions="You are a helpful assistant that can write and execute Python code to solve problems.",
    model="gpt-4o",
    tools=[{"type": "code_interpreter"}]
)

# Create a thread for this assistant
thread = client.beta.threads.create()

# Ask a coding-related question
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Can you create a simple python function to check if a number is prime?"
)

# Run the assistant
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=code_assistant.id
)

# Wait for completion
while run.status != "completed":
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
    
    # If the run requires action (e.g., for function execution), we'd handle that here
    if run.status == "requires_action":
        # Handle tool actions (not shown in this basic example)
        pass

# Get the response
messages = client.beta.threads.messages.list(thread_id=thread.id)
for message in messages.data:
    if message.role == "assistant":
        print(f"Assistant: {message.content[0].text.value}")

Expected Output:

Assistant: Here's the `is_prime` function again for your reference:

```python
def is_prime(n):
    """Check if a number is prime."""
    if n <= 1:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True
```

This function returns `True` if a number is prime and `False` otherwise. It efficiently checks for primes by testing divisibility only up to the square root of the number. 

If you'd like to know more or have other requests, feel free to ask!
Assistant: The function `is_prime(n)` checks whether a given number \( n \) is prime. Here are the test results for various numbers:

- 1: False
- 2: True
- 3: True
- 4: False
- 5: True
- 16: False
- 17: True
- 18: False
- 19: True
- 20: False

If you have any specific numbers you'd like to test or need modifications to the function, let me know!

Managing Persistent Conversations

One of the key advantages of the Assistants API is its ability to maintain persistent conversation state through Threads. This makes it suitable for applications where context needs to be maintained over time.

# Create an assistant for our persistent conversation
conversation_assistant = client.beta.assistants.create(
    name="Conversation Assistant",
    instructions="You are a helpful assistant that remembers previous parts of the conversation.",
    model="gpt-4o"
)

# Create a thread that will persist throughout the conversation
persistent_thread = client.beta.threads.create()

# Function to send a message and get the response
def send_message_and_get_response(thread_id, assistant_id, message_content):
    # Add the user's message to the thread
    client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=message_content
    )
    
    # Run the assistant on the thread
    run = client.beta.threads.runs.create(
        thread_id=thread_id,
        assistant_id=assistant_id
    )
    
    # Wait for the run to complete
    while run.status == "queued" or run.status == "in_progress":
        time.sleep(1)
        run = client.beta.threads.runs.retrieve(
            thread_id=thread_id,
            run_id=run.id
        )
    
    # Get the latest message from the assistant
    messages = client.beta.threads.messages.list(
        thread_id=thread_id
    )
    
    # Return the assistant's response
    for message in messages.data:
        if message.role == "assistant":
            return message.content[0].text.value
    
    return "No response received."

# First message in the conversation
response1 = send_message_and_get_response(
    persistent_thread.id,
    conversation_assistant.id,
    "My name is Alex and I'm learning about AI."
)
print(f"Response 1: {response1}")

# Second message, building on the context of the first
response2 = send_message_and_get_response(
    persistent_thread.id,
    conversation_assistant.id,
    "What was my name again? And what am I learning about?"
)
print(f"Response 2: {response2}")

Expected Output:

Response 1: Hi Alex! It's great to meet you. Learning about AI is an exciting journey. Is there anything specific about AI that you're interested in exploring? I'd be happy to help you understand concepts, discuss recent developments, or point you toward useful resources.

Response 2: Your name is Alex, and you're learning about AI (Artificial Intelligence). Is there a particular aspect of AI you're most interested in learning about? I'm here to help with any questions you might have on the subject.

Function Calling with External APIs

Function calling allows your assistant to call external functions that you define. Let's give our agent access to the OpenWeather API. You need an API key from the provider, which you can get from the dashboard once you create an account.

If the requests package is not installed already, you can install it via pip:

pip install requests

Then import the required packages:

import requests
import json
import time

Now, let's implement the function that fetches the weather from the API.

NOTE: The implementation may differ for other tool services, so read their documentation.

# --- Weather Tool Implementation ---
def get_weather(location: str, unit: str = "fahrenheit"):
    # The API endpoint from the provider
    url = "https://api.openweathermap.org/data/2.5/weather"
    params = {
        "q": location,
        "appid": "WEATHER_API_KEY",  # Replace with your OpenWeather API key
        "units": "imperial" if unit == "fahrenheit" else "metric"
    }
    
    try:
        response = requests.get(url, params=params)
        response.raise_for_status()
        data = response.json()
        return {
            "temperature": data["main"]["temp"],
            "unit": unit,
            "description": data["weather"][0]["description"],
            "humidity": data["main"]["humidity"],
            "location": location
        }
    except Exception as e:
        return {"error": str(e)}

Create the client with your credentials as before. If you are using your organisation's proxy endpoints, make sure the endpoint supports the /openai suffix at the end of the base_url; for example, the endpoint might look like https://organisation-proxy/openai. Otherwise, you can reuse the earlier snippet to create the client.
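For example, a proxy-based client might be configured like this (the URL below is a placeholder, not a real endpoint):

from openai import OpenAI

# Hypothetical organisation proxy; note the /openai suffix on the base URL
client = OpenAI(
    api_key="your_api_key_here",
    base_url="https://organisation-proxy/openai"
)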

Next, define the tool's schema in JSON, specifying its parameters (type, properties, required fields). This JSON Schema style is the standard way to describe function parameters, though the exact definition can vary by provider.

tools = [
    {"type": "code_interpreter"},
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a given location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }
]

Define the assistant with instructions and tools, then create a thread and run it as below:

assistant = client.beta.assistants.create(
    name="Python Coding & Weather Assistant",
    instructions="You are a helpful assistant that can write and execute Python code and fetch weather.",
    model="gpt-4o-mini",
    tools=tools
)

thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What's the weather in Paris in celsius?"
)

run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id
)

When a run requires action, the model's tool calls and their arguments are surfaced to your client; you execute the corresponding functions locally and submit the outputs back to the run:

while True:
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
    if run.status == "completed":
        break
    elif run.status == "requires_action":
        tool_calls = run.required_action.submit_tool_outputs.tool_calls
        tool_outputs = []
        for tool_call in tool_calls:
            if tool_call.function.name == "get_weather":
                args = json.loads(tool_call.function.arguments)
                weather_data = get_weather(
                    location=args.get("location"),
                    unit=args.get("unit", "fahrenheit")
                )
                tool_outputs.append({
                    "tool_call_id": tool_call.id,
                    "output": json.dumps(weather_data)
                })
        client.beta.threads.runs.submit_tool_outputs(
            thread_id=thread.id,
            run_id=run.id,
            tool_outputs=tool_outputs
        )
    else:
        time.sleep(1)

Finally, retrieve the responses from the assistant as below:

messages = client.beta.threads.messages.list(thread_id=thread.id)
for message in messages.data:
    if message.role == "assistant":
        print(f"Assistant: {message.content[0].text.value}")

Expected Output:

Assistant: The current weather in Paris is as follows:

- **Temperature:** 22.79°C
- **Description:** Broken clouds
- **Humidity:** 34%

Best Practices for Using the Assistants API

When working with the Assistants API, consider the following best practices:

  1. Clear Instructions: Provide clear, specific instructions to your Assistant to guide its behavior and responses.

  2. Appropriate Tool Selection: Only enable the tools your assistant needs for its intended function.

  3. Error Handling: Implement error handling, especially for function calls that might fail.

  4. Thread Management: Consider how long you need to persist threads and implement appropriate cleanup strategies (see the cleanup sketch after this list).

  5. Response Processing: Parse and validate assistant responses, especially when they include tool outputs or file references.
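
On thread management specifically, threads persist server-side until you delete them. A minimal cleanup sketch once a conversation is finished (deletion is permanent, so export anything you want to keep first):

# Delete a thread (and its messages) when it is no longer needed
client.beta.threads.delete(persistent_thread.id)

# Retired assistants can be deleted the same way
client.beta.assistants.delete(conversation_assistant.id)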

Conclusion

Let's recall the concepts we have covered. We learned about the Assistants API, its key components, and how to set up the environment, including configuration scenarios for official and proxy endpoints. We built our first assistant, learned what threads are, appended messages to them, and ran the assistant on those threads. We also retrieved messages from the thread history.

Next, we moved on to function calling: defining tools in JSON, managing persistent conversations, and integrating the weather API as a tool service, along with some best practices for using the Assistants API.
