The Observe phase is the simplest part of the agent loop but plays a vital role. After an agent executes an action (calls a function, queries a database, fetches data), it needs to observe the results and add them to its context for the next reasoning step. Think of it as the agent's sensory system – it "sees" what happened after taking an action and uses that information to decide what to do next.
Core components
At its core, observation involves several key steps (sketched in code after this list):
Collecting outputs – The agent receives results from executed tools or functions. This could include raw data, JSON objects, API responses, or error messages.
Formatting results – Outputs are structured and prepared so the agent can easily process them in the next reasoning step.
Adding results to context – The agent incorporates these observations into its working memory or conversation history. This ensures all relevant information is available for subsequent reasoning.
Preparing for the next Thought phase – By updating its context, the agent is ready to analyze, plan, and decide on the next action based on the latest information.
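Taken together, these steps need very little code. Below is a minimal sketch in Python; the message format and the observe helper are assumptions made for illustration, not a specific framework's API:

    import json

    def observe(context: list[dict], tool_name: str, raw_result) -> list[dict]:
        """Collect a tool's output, format it, and add it to the agent's context."""
        # Collect: raw_result may be a dict, a string, a number, or an error message.
        # Format: serialize it into a string the model can read in the next step.
        if isinstance(raw_result, (dict, list)):
            content = json.dumps(raw_result)
        else:
            content = str(raw_result)
        # Add to context: append the observation to the conversation history.
        context.append({"role": "tool", "name": tool_name, "content": content})
        # The updated context is now ready for the next Thought phase.
        return context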
The Observe phase transforms raw action results into usable knowledge, allowing the agent to make informed decisions in the next iteration of the loop. Even though it may appear simple, this phase is essential for maintaining continuity, accuracy, and adaptability in the agent’s reasoning process.
The observation process
Here's what happens during observation:
Agent executes action → Function returns result → Agent observes output → Output added to context → Loop continues

For example, if an agent calls a weather API:

Action: Call get_weather(city="London")
Observation: Receive {"temperature": 15, "condition": "cloudy", "humidity": 75}
Context update: Add this result to the conversation history
Next thought: Agent reasons about what to do with this weather data
The observation itself doesn't need complex logic – it mainly focuses on data collection and context management. The agent takes the function's output (whether it's a JSON object, string, number, or error message) and makes it available for the next reasoning cycle.
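Continuing the weather example, this pass-through behavior could look like the following sketch; get_weather is a stand-in for a real API call, and the try/except simply turns failures into observable error messages:

    import json

    def get_weather(city: str) -> dict:
        # Stand-in for a real weather API call.
        return {"temperature": 15, "condition": "cloudy", "humidity": 75}

    context = [{"role": "user", "content": "What's the weather in London?"}]

    # Action, then observation: capture whatever comes back, including errors.
    try:
        content = json.dumps(get_weather(city="London"))
    except Exception as exc:
        content = f"Error: {exc}"

    # Context update: make the result available to the next reasoning cycle.
    context.append({"role": "tool", "name": "get_weather", "content": content})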
Context selection
While observation seems straightforward for simple agents with 3-4 functions, a major challenge emerges as systems grow more complex. What context should be passed to the next LLM request?
This problem becomes critical when:
Your agent has dozens of available functions
You're passing large documents into the agent
Multiple observations build up over several iterations
Function outputs contain verbose or redundant information
Consider an agent that has made 10 function calls, each returning different amounts of data. By the 11th iteration, your context might contain (see the sketch after this list):
The original user query
All previous thoughts and reasoning steps
Ten function call results (some large, some small)
Function definitions for dozens of available tools
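In message-list form, that accumulated context might look roughly like this; the contents and sizes are invented for illustration:

    context = [
        {"role": "user", "content": "Plan a weekend trip to London"},    # original query
        {"role": "assistant", "content": "I should check the weather"},  # a thought
        {"role": "tool", "name": "get_weather", "content": "{...}"},     # small result
        # ...eight more thought/result pairs, some results very large...
        {"role": "tool", "name": "search_hotels", "content": "<20 KB of JSON>"},
    ]
    # Plus the definitions of dozens of available tools, resent with every request.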
This creates a context management problem. LLMs have token limits, and more context means (a rough cost estimate follows this list):
Higher latency (more tokens to process)
Increased costs (more tokens = more expensive)
Potential performance degradation (too much irrelevant context can confuse the model)
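A back-of-the-envelope calculation makes the cost of this growth concrete. The per-token price and growth rate below are made-up figures, not any provider's actual numbers:

    PRICE_PER_TOKEN = 3.0 / 1_000_000  # hypothetical: $3 per million input tokens

    context_tokens = 2_000        # iteration 1: query plus tool definitions
    growth_per_iteration = 1_500  # each loop adds thoughts and a function result

    for iteration in range(1, 12):
        cost = context_tokens * PRICE_PER_TOKEN
        print(f"Iteration {iteration}: {context_tokens} tokens, ~${cost:.4f}/request")
        context_tokens += growth_per_iteration

Because every iteration resends the whole history, per-request cost grows linearly, and the total spend for a run grows quadratically with the number of loop iterations.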
The question becomes: should you pass everything to the next reasoning step? Only recent observations? A summary of previous observations? Just the most relevant parts?

For simple agents with limited function sets and short conversations, context management isn't a concern – you can safely pass everything to each iteration. However, as your agentic systems grow, observation changes from a simple step into a strategic bottleneck.
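As a preview, the simplest of these strategies is a sliding window that keeps the original query but only the most recent messages. This is a sketch of the idea under the message format used above, not a recommended production policy:

    def select_context(history: list[dict], keep_recent: int = 6) -> list[dict]:
        """Naive sliding window: the original query plus the last few messages."""
        if len(history) <= keep_recent + 1:
            return history  # short conversations: pass everything
        return history[:1] + history[-keep_recent:]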
How do we solve this? What strategies exist for intelligent context selection? Why does too much context hurt LLM performance? We'll explore these questions and practical solutions in dedicated topics.
Conclusion
Observation is how an agent collects function outputs and feeds them back into the loop. While it seems simple – gather results, add to context, continue – it becomes a complex challenge when handling multi-function agents and large data volumes.