
Orchestration patterns for multi-agent systems


We've explored why multi-agent systems matter and how agents can collaborate through delegation, teamwork, and debate. But there's a crucial architectural question we haven't fully addressed yet: how do you actually structure and control these collaborations?

Orchestration defines how agents coordinate and who controls workflow. It addresses both logical flow (how agents decide what happens next) and architectural structure (where control lives in the system). The core questions are: Who decides which agent runs next? How does information flow? How do you ensure effective coordination rather than conflict?

Two orchestration approaches

Before we dive into specific orchestration patterns, we need to understand a fundamental choice. Should you let the LLM reason about what to do next, or should your code explicitly control the flow? This distinction has profound implications for how your system behaves.

  • LLM-driven orchestration: Let the LLM decide workflow dynamically based on reasoning.

# Agent autonomously plans approach
research_agent = Agent(
    instructions="Research competitive landscape and write report",
    tools=[web_search, analyze_data, write_document],
    handoffs=[specialist_agents]
)

# Agent decides: search competitors → analyze each → compare → write
# Adapts if it finds 10 competitors instead of 3

Pros: Flexible, adapts to open-ended tasks, handles unexpected situations

Cons: Less predictable, potentially higher cost, harder to debug

  • Code-driven orchestration: Your code explicitly controls the flow.

# You define exact sequence
def research_pipeline(topic):
    sources = search_agent.run(topic)
    analysis = analysis_agent.run(sources)
    report = writing_agent.run(analysis)
    return report

Pros: Predictable, controllable costs, easier debugging

Cons: Rigid, can't adapt to variations

The tradeoff is fundamental. LLM-driven orchestration gives you an agent that can think creatively about how to solve problems and adapt when things don't go as planned. Code-driven orchestration gives you deterministic behavior and makes it easy to reason about costs and performance, but you lose the ability to handle unexpected variations gracefully.

In practice: Mix both. Use code for high-level phases, LLM for details within each phase.

The optimal approach is usually hybrid. You might use code to orchestrate the major phases of a research project (gathering sources, analysis, writing) while letting LLM agents figure out the detailed steps within each phase. Or you might have an LLM planner agent create a task breakdown, then use code to execute that plan reliably. The key is choosing the right level of abstraction for each type of decision.
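The hybrid approach can be sketched roughly as follows. The `StubPlanner` and `StubAgent` classes are hypothetical stand-ins for LLM-backed components, not a real framework API: code fixes the order of the phases, while the planner decides the detailed steps inside each one.

```python
# Hypothetical sketch of hybrid orchestration: code controls the high-level
# phases, while an LLM planner (stubbed here) decides the detailed steps
# within each phase. Names and methods are illustrative assumptions.

class StubPlanner:
    def plan_steps(self, phase, topic):
        # A real planner would call an LLM; here we return canned steps.
        return [f"{phase}: step 1 on {topic}", f"{phase}: step 2 on {topic}"]

class StubAgent:
    def run(self, step):
        return f"done({step})"

def run_research_project(topic, planner, specialists):
    sections = []
    for phase in ["gather_sources", "analyze", "write"]:  # code-driven order
        steps = planner.plan_steps(phase, topic)          # LLM-driven detail
        sections.append((phase, [specialists[phase].run(s) for s in steps]))
    return sections

specialists = {p: StubAgent() for p in ["gather_sources", "analyze", "write"]}
report = run_research_project("AI in healthcare", StubPlanner(), specialists)
```

The phase sequence is deterministic and cheap to reason about, while the step-level variability stays contained inside each phase.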

Centralized orchestration

In centralized orchestration, a single coordinator or manager agent acts as the central authority directing all workflow. Every agent looks to this central coordinator to understand what they should do next, and all coordination flows through this single point of control. This creates clear lines of authority and makes it easy to understand how your system works, but it also creates a potential bottleneck and single point of failure.

  1. Sequential orchestration

Fixed pipeline where manager directs step-by-step:

class SequentialOrchestrator:
    def __init__(self, specialists):
        self.specialists = specialists  # [research, outline, write, edit, seo]

    def execute(self, task):
        result = task
        for specialist in self.specialists:
            result = specialist.run(result)
        return result

# Blog post pipeline
pipeline = SequentialOrchestrator([
    research_agent,   # Gather sources
    outline_agent,    # Create structure
    writing_agent,    # Write draft
    editing_agent,    # Refine language
    seo_agent        # Optimize metadata
])

blog_post = pipeline.execute("Write about AI in healthcare")

When to use: Clear dependencies, each step builds on previous, predictable workflow.

Sequential orchestration is the simplest form of centralized control. The blog post pipeline above makes perfect sense as a sequence: you can't write before you have an outline, and you can't create an outline before you have research. The dependencies are real, so enforcing them through sequential orchestration is natural rather than artificial.

The main limitation is that it can't take advantage of parallelism. For many workflows, these limitations don't matter, and the clarity of sequential orchestration makes it the right choice.
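When some stages are genuinely independent, a fan-out variant can recover that lost parallelism. The sketch below uses a thread pool (LLM calls are I/O-bound, so threads work well) and stub agents standing in for real ones:

```python
# Sketch: running independent specialists in parallel with a thread pool,
# then collecting their results in submission order. StubAgent is a
# hypothetical stand-in for an LLM-backed agent.

from concurrent.futures import ThreadPoolExecutor

class StubAgent:
    def __init__(self, name):
        self.name = name

    def run(self, task):
        return f"{self.name} -> {task}"

def parallel_stage(agents, task):
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = [pool.submit(agent.run, task) for agent in agents]
        return [f.result() for f in futures]

results = parallel_stage([StubAgent("seo"), StubAgent("fact_check")], "draft")
```

Such a stage only makes sense where dependencies genuinely allow it; forcing parallelism onto a pipeline with real ordering constraints just reintroduces the coordination problem.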

  2. Hierarchical orchestration

As your system grows beyond a handful of agents, flat sequential orchestration starts to strain. Managing fifty specialists with one central orchestrator becomes unwieldy. Hierarchical orchestration mirrors how large organizations work, with layers of management that each handle coordination at their level of scope.

# Department managers (defined first so the top-level agent can reference them)
validation_manager = Agent(
    instructions="Coordinate order validation",
    handoffs=[inventory_check, payment_verify, address_confirm]
)

fulfillment_manager = Agent(
    instructions="Coordinate order fulfillment",
    handoffs=[packing_agent, shipping_agent]
)

# Top-level orchestrator
enterprise_system = Agent(
    instructions="Process customer orders",
    handoffs=[validation_manager, fulfillment_manager]
)

Scalability becomes much more natural. If you need to add a new validation check, say fraud detection, you simply add it to the validation manager's handoff list. The enterprise system doesn't need to know about it. This isolation dramatically reduces the risk that changes in one part of your system will break something seemingly unrelated in another part.
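The isolation property can be shown in a few lines, assuming a minimal `Agent` stand-in for whatever framework class the snippets above use:

```python
# Sketch of the isolation property: adding a fraud check only touches the
# validation manager's handoff list; the top-level agent is unchanged.
# Agent is a minimal hypothetical stand-in, not a real framework class.

class Agent:
    def __init__(self, instructions, handoffs=None):
        self.instructions = instructions
        self.handoffs = handoffs or []

inventory_check = Agent("Check stock levels")
payment_verify = Agent("Verify payment")
validation_manager = Agent("Coordinate order validation",
                           handoffs=[inventory_check, payment_verify])
enterprise_system = Agent("Process customer orders",
                          handoffs=[validation_manager])

# New requirement: fraud detection. Only the validation layer changes.
fraud_check = Agent("Screen orders for fraud signals")
validation_manager.handoffs.append(fraud_check)
```

The top-level handoff list is untouched, which is exactly the change-isolation the hierarchy buys you.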

The main challenge is latency. Each level adds overhead when information must flow sequentially through layers, though this can be mitigated if levels operate in parallel. In our order processing example, information flows from the user to the enterprise system, then to the validation manager, then to the inventory check specialist, then back up through the layers. Each transition involves an LLM call, which means multiple round trips. For time-sensitive applications, this latency can become prohibitive.

When to use: Large systems with clear departmental boundaries, when you're scaling to dozens of agents, when your domain naturally matches organizational structure, or when different parts of your system are maintained by different teams.

  3. Magentic orchestration

The third centralized pattern takes a fundamentally different approach from both sequential and hierarchical orchestration. Rather than following a fixed sequence or predetermined hierarchy, magentic orchestration gives the manager agent intelligence to dynamically plan and adapt its approach based on what it discovers during execution. The term "magentic" comes from "manager-agent," emphasizing that this is still centralized control, but with the manager acting more like a strategic planner than a rigid coordinator.

class MagenticOrchestrator:
    def __init__(self, manager, specialist_pool):
        self.manager = manager
        self.specialists = specialist_pool

    def execute(self, task):
        # Manager creates initial plan
        plan = self.manager.create_plan(task)
        results = {}

        while not plan.is_complete():
            # Get next subtask
            subtask = plan.get_next_task()

            # Manager selects appropriate specialist
            specialist = self.manager.select_specialist(subtask)

            # Execute subtask
            result = specialist.run(subtask)
            results[subtask.id] = result

            # Manager updates plan based on results
            plan = self.manager.update_plan(plan, result)

        return self.manager.synthesize(results)

The manager continuously plans, delegates, monitors, and replans based on discoveries, creating an adaptive system for the unexpected. This flexibility excels for open-ended research or exploratory analysis where the right approach emerges through discovery.

The tradeoff is unpredictability. Execution depends on discoveries. One query might need three specialists and thirty seconds, another eight specialists and five minutes. Debugging requires understanding why the manager chose each path. You also need termination conditions to prevent infinite loops.
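A common guard is a hard iteration cap layered on top of the plan's own completion check. The sketch below uses a stub plan rather than a real manager:

```python
# Sketch of a termination guard for an adaptive planner loop: a hard
# iteration cap alongside the plan's completion check. StubPlan is a
# hypothetical stand-in for a manager-maintained plan.

class StubPlan:
    def __init__(self, tasks):
        self.tasks = list(tasks)

    def is_complete(self):
        return not self.tasks

    def get_next_task(self):
        return self.tasks.pop(0)

def run_with_guards(plan, run_task, max_iterations=50):
    results = []
    for _ in range(max_iterations):
        if plan.is_complete():
            return results
        results.append(run_task(plan.get_next_task()))
    # Hitting the cap means the planner kept replanning without converging.
    raise RuntimeError(f"no convergence after {max_iterations} iterations")

results = run_with_guards(StubPlan(["a", "b"]), lambda t: t.upper())
```

In production you would typically also cap token spend and wall-clock time, since an adaptive manager can burn budget long before it hits an iteration limit.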

When to use: Open-ended tasks, incomplete initial understanding, research where findings open new questions, unpredictable complex analysis, or when adaptation matters more than predictability.

Decentralized orchestration

Centralized orchestration provides clarity and control through a single authority directing workflow. But some problems don't fit this model. What if no agent has enough context to direct the entire workflow? What if the right next step depends on discoveries made during execution? What if multiple independent systems need to coordinate without surrendering autonomy?

This is where decentralized orchestration becomes valuable. Rather than having a central manager, control is distributed across agents that coordinate through interaction.

  1. Group chat orchestration

In group chat orchestration, multiple agents participate in a shared conversation as peers, with each agent contributing based on its expertise and perspective. Think of it as a team meeting where specialists discuss a problem together, building on each other's contributions to reach a conclusion.

The key difference from centralized patterns is that no manager agent directs the flow. Instead, agents take turns speaking, responding to what others have said, and collectively moving toward a solution.

class GroupChatOrchestrator:
    def __init__(self, participants, max_turns=20):
        self.participants = participants
        self.conversation = []
        self.max_turns = max_turns

    def select_next_speaker(self):
        # Selection strategies: round-robin, LLM-based, or self-nomination.
        # Here an LLM picks whoever would best move the discussion forward.
        selector_prompt = ("Given the conversation so far, who should speak "
                          "next to move the discussion forward?")
        return self.llm_select(selector_prompt, self.conversation,
                               self.participants)

    def execute(self, task):
        self.conversation.append({"role": "user", "content": task})

        for turn in range(self.max_turns):
            speaker = self.select_next_speaker()
            message = speaker.generate(self.conversation)
            self.conversation.append({"role": speaker.name,
                                     "content": message})

            if self.has_reached_conclusion():
                break

        return self.synthesize_outcome()

The critical design decision is how you select the next speaker:

  • Round-robin selection: Rotate through agents in fixed order. Guarantees equal participation and predictable cost, but rigid and inefficient.

  • LLM-based selection: Use an additional LLM call to decide who should speak next. Creates natural dynamics but increases cost and risks dominance by some agents.

  • Agent self-nomination: Each agent decides whether they want to speak, then randomly select from those who volunteer. Balances natural dynamics with reasonable cost.
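Round-robin is the simplest of the three to implement. A minimal sketch, with participant names as plain strings:

```python
# Minimal round-robin speaker selection: cycle through participants in a
# fixed order regardless of conversation content. Cheap and fair, but blind
# to who actually has something useful to say next.

from itertools import cycle

def make_round_robin_selector(participants):
    order = cycle(participants)

    def select_next_speaker():
        return next(order)

    return select_next_speaker

select = make_round_robin_selector(["architect", "security", "frontend"])
turns = [select() for _ in range(5)]
```

LLM-based and self-nomination strategies replace the fixed cycle with one extra model call per turn, which is where the cost difference between the strategies comes from.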

The conversation continues until you reach a termination condition: maximum turn count to prevent infinite discussions, LLM-based check determining the problem is solved, or explicit consensus where all agents agree the discussion is complete.

The challenge with group chat orchestration is managing conversation quality. Without a moderator, discussions can become repetitive, circular, or fail to converge. Another practical consideration is cost. A twenty-turn group chat with four participants and LLM-based selection means forty LLM calls, which accumulates rapidly for complex discussions.

When to use: Design discussions where multiple perspectives genuinely matter, problems that benefit from debate and synthesis rather than decomposition, brainstorming and creative tasks, situations where no single agent has enough context to solve the problem alone, or when you want specialists to challenge each other's assumptions and find better solutions through dialogue.

  2. Handoff orchestration

Handoff orchestration represents a middle ground between centralized control and group chat collaboration. Rather than having a central manager directing everything or having all agents participate in a shared discussion, handoff orchestration creates a network where agents can dynamically pass work to each other based on their assessment of what's needed next.

class HandoffOrchestrator:
    def __init__(self, entry_agent, agent_network, max_handoffs=10):
        self.entry_agent = entry_agent
        self.network = agent_network
        self.max_handoffs = max_handoffs  # guard against handoff loops

    def execute(self, task):
        current_agent = self.entry_agent
        conversation = [{"role": "user", "content": task}]

        for _ in range(self.max_handoffs):
            result = current_agent.process(conversation)

            if result.is_complete:
                return result.output

            if result.handoff_to:
                next_agent = self.network.get(result.handoff_to)
                conversation.append({
                    "role": current_agent.name,
                    "content": f"Handoff: {result.context}"
                })
                current_agent = next_agent

        raise RuntimeError("Exceeded max handoffs without completing the task")
Agents in a handoff system make three types of decisions at each step. First, can I complete this task myself? If yes, do the work and return results. If no, proceed to the next decision. Second, what specialist would be best suited to handle this next? This requires understanding both the current state and the capabilities of other agents in the network. Third, what context does that specialist need to work effectively? Not everything in the conversation history may be relevant, so good handoffs include curated context.

Handoff orchestration shares similarities with group chat: both handle unexpected situations well and require agents to understand other agents' capabilities. However, handoff orchestration offers a crucial efficiency advantage. If agents are skilled at selecting the right next agent and passing only relevant context, you avoid running every agent on every conversation turn. Only the agents actually needed for the task get invoked.

When to use: Customer service where issues require contextual routing between departments, technical support where problems often reveal unexpected complications, consultation services where one specialist's findings determine which specialist should be engaged next, or any scenario where the right expert depends on what you discover along the way rather than what you know at the start.

Federated orchestration

Federated orchestration tackles a unique challenge that none of the other patterns fully address: how do you coordinate agents when they operate across organizational or jurisdictional boundaries? When different entities need to maintain local control and autonomy while still participating in a larger coordinated system, federated orchestration provides the architectural pattern.

The basic structure involves multiple autonomous systems, each internally centralized using any of the orchestration patterns we've discussed, but which coordinate with each other through a federation protocol:

Hospital A System (centralized internally)
├─ Validation Manager
├─ Scheduling Manager
└─ Billing Manager
    ↕ Federation Protocol
Hospital B System (centralized internally)
├─ Validation Manager
├─ Scheduling Manager
└─ Billing Manager

Let's explore a concrete healthcare example. Imagine a patient needs care involving multiple healthcare providers: their primary care physician, a specialist at a different hospital, and a pharmacy. Each provider has their own multi-agent system for managing appointments, medical records, billing, and so on. These systems were built independently, use different technologies, and are maintained by different organizations.

Without federation, coordination is manual. The patient tells their primary care doctor about the specialist visit. The doctor's office calls the specialist to get records. The specialist manually sends prescriptions to the pharmacy. Information moves slowly, through human intermediaries, with many opportunities for error or miscommunication.

With federated orchestration, these independent systems can coordinate while respecting organizational boundaries. When the primary care doctor's system schedules a specialist consultation, it sends a structured message through the federation protocol to the specialist's system: "Patient needs cardiology consultation. Here's the relevant medical history and the reason for referral." The specialist's system receives this message, processes it internally using its own agents, and schedules an appointment. It then sends a response back: "Consultation scheduled for March 15. We'll need these additional records before the visit."

The key insight is that each organization's internal system remains private and autonomous. Hospital A doesn't get to see or control Hospital B's internal agents. Hospital B doesn't need to know how Hospital A structures its workflows. They only interact through the federation protocol, which defines what messages they can exchange and what those messages mean.
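The protocol messages themselves are typically small structured payloads, which is all that crosses the organizational boundary. The schema below is purely illustrative, not an actual healthcare interchange standard:

```python
# Illustrative federation message: a small, explicit payload crosses the
# boundary; internal agents and records stay private. Field names are
# hypothetical, not drawn from any real healthcare standard.

import json
from dataclasses import dataclass, asdict

@dataclass
class ReferralRequest:
    sender: str       # originating organization
    recipient: str    # target organization
    patient_ref: str  # opaque token, not an internal record ID
    specialty: str
    reason: str

request = ReferralRequest(
    sender="primary-care.example",
    recipient="cardiology-hospital.example",
    patient_ref="token-1234",  # neither side exposes internal identifiers
    specialty="cardiology",
    reason="abnormal ECG, needs consultation",
)
wire_message = json.dumps(asdict(request))
```

Because only these agreed-upon fields are exchanged, each party can validate incoming messages against the schema and reject anything else, which is how the trust boundary stays explicit.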

This separation of concerns provides several critical benefits:

  • First, regulatory compliance becomes manageable. HIPAA in the United States, GDPR in Europe, and other data protection regulations often restrict what information can be shared and how.

  • Second, organizational autonomy is preserved. Each healthcare provider can update their internal systems, change their agent architectures, or adopt new technologies without coordinating with every other participant in the federation.

  • Third, trust boundaries are explicit. In centralized orchestration, all agents implicitly trust each other because they're part of the same system. In federation, trust relationships are explicit and limited. Hospital A trusts that when Hospital B says "appointment confirmed," that's accurate, but Hospital A doesn't grant Hospital B access to its internal patient database or billing systems.

Implementing federated orchestration is significantly more complex than other patterns. You're essentially building distributed systems with all the challenges that entails: network partitions, message ordering, eventual consistency, and failure handling. Federated orchestration is not a choice you make for convenience or elegance, but rather because your domain requires it. Use it only when organizational boundaries, regulatory requirements, or trust constraints make other patterns infeasible.

When to use: Cross-organizational collaboration where each entity maintains control of their internal systems, highly regulated industries like healthcare or finance where data governance requires clear boundaries, situations where no single entity has authority over the entire system, or when you need to connect independently developed and operated agent systems that were never designed to work together but now need to coordinate.

Choosing the right pattern

| Pattern | Best for | Example |
| --- | --- | --- |
| Sequential | Fixed workflow, clear dependencies | Content pipelines, data processing |
| Hierarchical | Large systems, organizational structure | Enterprise automation, multi-department |
| Magentic | Open-ended, adaptive planning | Research projects, complex analysis |
| Group Chat | Multiple perspectives, consensus | Design discussions, problem-solving |
| Handoff | Contextual routing, flexible | Customer service, technical support |
| Federated | Cross-boundary, regulated | Healthcare, financial services |

Conclusion

Orchestration patterns define how your multi-agent system operates. By choosing patterns that match your problem structure and combining them thoughtfully, you can build systems where specialized agents work together effectively to solve complex problems.

  1. Control location matters – Centralized provides clarity, decentralized provides flexibility

  2. LLM vs code tradeoff – Flexibility vs predictability

  3. Match pattern to problem – Sequential for pipelines, group chat for open problems

  4. Patterns compose – Layer multiple approaches in sophisticated systems

  5. Observability is critical – Distributed systems need comprehensive monitoring

  6. Start simple – Most problems don't need the most complex patterns

The journey from single agents to multi-agent systems represents a fundamental evolution in LLM application architecture. With the right collaboration patterns and orchestration, you can tackle problems that would overwhelm any single agent, no matter how sophisticated.
