5.3 · Intermediate · 6 min

The Supervisor Pattern: Making One Agent the Boss

Blck Alpaca
Definition

The supervisor pattern is a multi-agent architecture in which a central supervisor agent decides which specialised sub-agent acts next. The supervisor receives the request, routes it to suitable worker agents, gathers their results and steers the flow until the task is solved. It does not reason about the domain itself but delegates and coordinates.

Key Takeaways

  • A supervisor agent routes between specialised sub-agents and decides after each step who acts next - the sub-agents do not communicate directly with one another.
  • AWS Bedrock AgentCore (Supervisor + Collaborator, GA March 2025) and LangGraph 1.0 (hierarchical supervisor template, 22 October 2025) are the production-ready reference implementations as of 2026.
  • The supervisor pattern suits parallelisable, loosely coupled tasks with a clear domain decomposition; for tightly coupled write workflows (e.g. code) it becomes fragile.
  • The central limitation is the supervisor itself: single point of failure, latency bottleneck and plan drift on long runs.
  • Allianz Project Nemo (seven agents, 80% reduction in handling and settlement time for eligible claims under AUD 500, human-in-the-loop) is the best-documented DACH reference for a hierarchical supervisor system.

In short, the supervisor pattern makes one agent the boss over a team of specialists.

Quick answers:

  • What does the supervisor do? It routes between sub-agents and decides after each step who is up next - the sub-agents do not talk to one another directly.
  • What do you build it with? As of 2026, with LangGraph 1.0 (open source) or AWS Bedrock AgentCore (Supervisor + Collaborator); A2A for cross-vendor, MCP for tools.
  • When does it fit? For parallelisable, loosely coupled tasks with a clear domain decomposition - not for tightly coupled write workflows such as code generation.

How the supervisor pattern works

The basic idea is a star-shaped topology (hub-and-spoke): the supervisor sits in the centre, the specialised worker agents around the outside. Each worker has its own prompt, its own tool set and often its own context window. The decisive point: the workers do not communicate with one another but exclusively via the supervisor. After each intermediate result, the supervisor decides afresh which agent should act next, or whether the task is complete.

A typical run follows this pattern:

  1. The supervisor receives the user request and classifies what kind of task it is.
  2. It selects a suitable specialist agent and hands it a clearly delineated assignment.
  3. The worker completes its subtask and returns a result - ideally a compressed one.
  4. The supervisor evaluates the result and decides: next worker, repetition or final synthesis.
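The four steps above can be sketched as a plain loop. This is a minimal illustration, not a framework implementation: the worker functions, the `route` function (which stands in for the supervisor's model call) and the `max_steps` budget are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class RunState:
    """Everything the supervisor knows about one run."""
    request: str
    results: Dict[str, str] = field(default_factory=dict)
    trace: List[str] = field(default_factory=list)


# Hypothetical workers: each takes the request and returns a compressed result.
def research_worker(request: str) -> str:
    return f"3 sources found for '{request}'"


def contract_worker(request: str) -> str:
    return "draft clause prepared"


WORKERS: Dict[str, Callable[[str], str]] = {
    "research": research_worker,
    "contract": contract_worker,
}


def route(state: RunState) -> Optional[str]:
    """Stand-in for the supervisor's model call: pick the next worker or stop."""
    for name in WORKERS:
        if name not in state.results:
            return name
    return None  # every specialist has reported, so the run can be synthesised


def run_supervisor(request: str, max_steps: int = 10) -> RunState:
    state = RunState(request=request)
    for _ in range(max_steps):  # hard step budget so the loop always ends
        next_worker = route(state)
        if next_worker is None:
            break
        # Workers never call each other; results only flow back to the hub.
        state.results[next_worker] = WORKERS[next_worker](state.request)
        state.trace.append(next_worker)
    return state


state = run_supervisor("supplier agreement review")
print(state.trace)  # ['research', 'contract']
```

In a real system `route` would be a model call that sees the request and the results so far; the loop structure around it stays the same.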

This central routing logic distinguishes the supervisor pattern from a swarm (peer agents without a fixed coordinator) and from a pipeline (fixed, deterministic order). The supervisor brings control and - important for DACH B2B - a clear line of accountability for the question "Who decided what?".

Distinction: supervisor vs. orchestrator-worker

In practice the terms are often used synonymously, and that is not wrong - but there is a difference in emphasis that matters for the architectural decision.

In the orchestrator-worker pattern (the canonical multi-agent pattern per Anthropic's taxonomy), a lead agent decomposes the task and dynamically spawns sub-agents - as many as the plan currently requires. The reference case is the Anthropic Research Agent (June 2025): a lead agent running Claude Opus 4 in extended-thinking mode decomposes the request and launches N parallel Sonnet 4 sub-agents via the Task tool. Each sub-agent has its own context window (around 200k tokens), researches independently and returns a compressed summary - not the full transcript. Anthropic's internal evaluation reports a 90.2% improvement on research breadth metrics over a single-agent Opus 4 setup, albeit at roughly 15x the token consumption.
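The fan-out described here can be illustrated with standard-library threads. This is only a sketch of the shape of the pattern: `decompose` and `sub_agent` are hypothetical stand-ins for model calls, and the "transcript" is dummy text used to show that only a short summary travels back to the lead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def decompose(request: str) -> List[str]:
    """Stand-in for the lead agent's planning step (hypothetical)."""
    return [f"{request}: {angle}" for angle in ("market", "regulation", "pricing")]


def sub_agent(subtask: str) -> str:
    """Stand-in for a sub-agent with its own context window.

    It returns only a short summary, never its full working transcript.
    """
    transcript = f"long research notes about {subtask} " * 50
    return f"summary[{subtask}] ({len(transcript)} chars condensed)"


def orchestrate(request: str) -> List[str]:
    subtasks = decompose(request)
    # One sub-agent per subtask, spawned for this plan only.
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        # map() returns results in the same order as the subtasks.
        return list(pool.map(sub_agent, subtasks))


summaries = orchestrate("EV charging")
for line in summaries:
    print(line)
```

The number of sub-agents follows from the plan rather than from a fixed roster, which is the main contrast with a static supervisor.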

The supervisor pattern places stronger emphasis on the routing element: the supervisor decides which of a set of fixed specialist agents acts next. AWS uses the label "Multi-Agent Collaboration" for its supervisor model - at its core this is a hub-and-spoke supervisor pattern, nothing more and nothing less. For three-tier variants (supervisor → planner → worker) the term is hierarchical multi-agent: the supervisor decides what kind of job it is, a planner builds the sub-task graph, and the workers execute it.

| Characteristic | Supervisor pattern | Orchestrator-worker |
| --- | --- | --- |
| Control | Central router decides per step | Lead plans and delegates |
| Sub-agents | Mostly fixed specialists | Dynamically spawned as needed |
| Typical task | Routing across known domains | Breadth research, fan-out |
| Reference as of 2026 | AWS Bedrock AgentCore, LangGraph | Anthropic Research Agent |
| Token factor vs. single agent | approx. 3-20x (depending on hierarchy depth) | approx. 5-15x |

The practical rule: every orchestrator is a supervisor, but not every supervisor spawns dynamically. For most enterprise workflows with known domains, the static supervisor variant is easier to evaluate and to audit.

Implementation: LangGraph and AWS Bedrock

As of 2026 there are two production-ready reference paths.

LangGraph (LangChain). With LangGraph 1.0 on 22 October 2025, the graph-based orchestration runtime became the first durable agent framework to reach a stable major version. It offers a hierarchical supervisor template, durable state with checkpointing (an interrupted run resumes from the last checkpoint after a server restart), human-in-the-loop pause/resume and streamable HTTP support for remote MCP servers. Licence: MIT. In 2026, LangGraph is the most common open-source orchestration runtime in DACH AI engineering teams.

Simplified pseudocode for a LangGraph supervisor:

```
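# Assumption: create_supervisor comes from the langgraph-supervisor package.
from langgraph_supervisor import create_supervisor

# research_agent, contract_agent and approval_agent are placeholder worker
# agents; "reasoning-model" stands in for a configured chat model.
supervisor = create_supervisor(
    agents=[research_agent, contract_agent, approval_agent],
    model="reasoning-model",
    prompt=(
        "Route the request to the appropriate specialist. "
        "Clear assignment per agent: goal, output format, boundaries. "
        "Finish when the task is fully solved."
    ),
)

# State flows as a typed state object through the nodes,
# is checkpointed and can be restored after a restart.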
```
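The checkpoint-and-resume behaviour mentioned for durable state can be illustrated without any framework. The sketch below is not LangGraph's checkpointer; it is a standard-library stand-in that writes state to a JSON file after every step, with hypothetical step names and a simulated restart.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List

STEPS: List[str] = ["classify", "research", "draft", "review"]


def load_checkpoint(path: Path) -> Dict:
    if path.exists():
        return json.loads(path.read_text())
    return {"done": []}


def run(path: Path, crash_after: int = -1) -> Dict:
    """Run the remaining steps, persisting state after each one.

    crash_after simulates a server restart after that many steps in this call.
    """
    state = load_checkpoint(path)
    executed = 0
    for step in STEPS:
        if step in state["done"]:
            continue  # finished before the restart, do not repeat
        if executed == crash_after:
            raise RuntimeError("simulated restart")
        state["done"].append(step)
        path.write_text(json.dumps(state))  # checkpoint after every step
        executed += 1
    return state


checkpoint = Path(tempfile.mkdtemp()) / "run.json"
try:
    run(checkpoint, crash_after=2)
except RuntimeError:
    pass
after_crash = load_checkpoint(checkpoint)["done"]
final = run(checkpoint)["done"]
print(after_crash)  # ['classify', 'research']
print(final)        # ['classify', 'research', 'draft', 'review']
```

The second call picks up at `draft` because the first two steps are already recorded, which is the property that makes long supervisor runs survivable.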

AWS Bedrock AgentCore Multi-Agent Collaboration. GA since March 2025. The model consists of a supervisor agent plus collaborator agents. It offers an inline agent runtime, a choice between an optimised "supervisor-with-routing" mode (faster, pure routing) and a full orchestration mode, and an integrated trace and debug console. AgentCore provides runtime, gateway, memory, identity and observability, and supports MCP.

For cross-agent communication - especially across vendor boundaries - you use the A2A protocol (Linux Foundation since June 2025). A2A defines a task lifecycle (submitted → working → input-required → completed | failed | canceled) in which the internal logic of the remote agent remains opaque - the supervisor sees only the assignment and the result. For the tool access of the individual agents, MCP is the standard. The vendors' rule of thumb is: MCP for agent-to-tool, A2A for agent-to-agent.
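The task lifecycle named above can be modelled as a small state machine on the supervisor side. The state names come from the text; the transition table and the `RemoteTask` class are assumptions made for illustration and are not copied from the A2A specification.

```python
from enum import Enum


class TaskState(Enum):
    SUBMITTED = "submitted"
    WORKING = "working"
    INPUT_REQUIRED = "input-required"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


TERMINAL = {TaskState.COMPLETED, TaskState.FAILED, TaskState.CANCELED}

# Assumed transition table for illustration; not copied from the A2A spec.
ALLOWED = {
    TaskState.SUBMITTED: {TaskState.WORKING, TaskState.FAILED, TaskState.CANCELED},
    TaskState.WORKING: {TaskState.INPUT_REQUIRED} | TERMINAL,
    TaskState.INPUT_REQUIRED: {TaskState.WORKING, TaskState.FAILED, TaskState.CANCELED},
}


class RemoteTask:
    """What the supervisor tracks about a task on an opaque remote agent."""

    def __init__(self, assignment: str) -> None:
        self.assignment = assignment
        self.state = TaskState.SUBMITTED
        self.history = [TaskState.SUBMITTED]

    def advance(self, new_state: TaskState) -> None:
        # Terminal states have no entry in ALLOWED, so nothing follows them.
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state
        self.history.append(new_state)

    @property
    def finished(self) -> bool:
        return self.state in TERMINAL


task = RemoteTask("verify supplier credit rating")
for step in (TaskState.WORKING, TaskState.INPUT_REQUIRED,
             TaskState.WORKING, TaskState.COMPLETED):
    task.advance(step)
print([s.value for s in task.history], task.finished)
```

The supervisor only ever sees the assignment, the state and eventually the result; how the remote agent gets from `working` to `completed` stays hidden.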

When it fits - and when it doesn't

The supervisor pattern plays to its strengths with parallelisable problems that have low state coupling between the subtasks, and when the write path is single-threaded or trivially mergeable. Good candidates: research, broad search, document fan-out, multi-document review, claims triage, fraud checking with independent checks.

It fails for tightly coupled, sequential tasks where every step shapes the implicit decisions of all the following ones. The clearest counter-example remains code generation across multiple files: Cognition.ai showed in 2025 ("Don't Build Multi-Agents") that parallel writer swarms make incompatible style and edge-case decisions because of context fragmentation and produce non-mergeable diffs. Cognition's updated conclusion (April 2026) is pragmatic: multi-agent works for read-mostly fan-out, provided the write operations stay single-threaded.
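The "read-mostly fan-out, single-threaded writes" rule can be sketched as follows. The documents and the `review` and `write_report` functions are hypothetical; the point is only that the parallel workers are read-only and a single function produces the written artefact afterwards.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

# Hypothetical input documents for the fan-out.
DOCUMENTS: Dict[str, str] = {
    "policy.txt": "Cover applies to spoilage after power loss.",
    "claim.txt": "Freezer contents lost, value 320.",
    "weather.txt": "Storm warning issued for the region.",
}


def review(item: Tuple[str, str]) -> Tuple[str, str]:
    """Read-only worker: inspects one document and returns a finding."""
    name, text = item
    return name, f"{len(text.split())} words reviewed"


def write_report(findings: List[Tuple[str, str]]) -> str:
    """The only function that writes; it runs in one thread, after the fan-out."""
    lines = [f"{name}: {finding}" for name, finding in sorted(findings)]
    return "\n".join(lines)


with ThreadPoolExecutor(max_workers=3) as pool:
    findings = list(pool.map(review, DOCUMENTS.items()))  # parallel reads

report = write_report(findings)  # single-threaded write
print(report)
```

Because no worker writes, there is nothing to merge: the writer sees all findings at once and makes every style and ordering decision in one place.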

Limitations: the supervisor as bottleneck

The central weakness is in the name: everything runs through the supervisor.

  • Single point of failure. If the supervisor fails or gets lost, the whole system stalls.
  • Latency bottleneck. Steps pass serially through the supervisor; with three hierarchy levels the latency cascades and can eat up the added value.
  • Plan drift. On long runs the supervisor loses track of its plan and re-spawns sub-tasks that have already been completed.
  • Vague delegation. Unclear assignments lead to duplicated work and gaps - Anthropic observed exactly this in early experiments. The remedy: each worker receives a goal, an output format, tool hints and clear task boundaries.
  • Prompt-injection amplification. Each new sub-agent context window is a new attack surface via untrusted content.
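Two of these limitations, vague delegation and plan drift, lend themselves to simple structural guards. The sketch below assumes a hypothetical `Assignment` brief carrying the four elements named above and a hypothetical `Ledger` that refuses to dispatch the same goal twice or to exceed a dispatch budget; the example values are made up.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Assignment:
    """A delegation brief: goal, output format, tool hints, task boundaries."""
    goal: str
    output_format: str
    tool_hints: Tuple[str, ...]
    boundaries: str


class Ledger:
    """Records dispatched goals so a drifting supervisor cannot repeat them."""

    def __init__(self, max_dispatches: int) -> None:
        self.max_dispatches = max_dispatches
        self.seen: Set[str] = set()
        self.log: List[str] = []

    def dispatch(self, assignment: Assignment) -> bool:
        """Return True if the worker should run, False if the call is refused."""
        if assignment.goal in self.seen:
            self.log.append(f"skipped duplicate: {assignment.goal}")
            return False
        if len(self.seen) >= self.max_dispatches:
            self.log.append(f"budget exhausted: {assignment.goal}")
            return False
        self.seen.add(assignment.goal)
        self.log.append(f"dispatched: {assignment.goal}")
        return True


# Hypothetical brief for a weather-checking worker.
brief = Assignment(
    goal="check weather data for the claim region",
    output_format="JSON with fields event, date, severity",
    tool_hints=("weather_api",),
    boundaries="no coverage or payout decisions",
)
ledger = Ledger(max_dispatches=5)
first = ledger.dispatch(brief)
second = ledger.dispatch(brief)  # the supervisor re-spawns the same sub-task
print(first, second, ledger.log)
```

Matching on the exact goal string is deliberately crude; a production guard would need a more robust notion of "the same sub-task".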

Example: Allianz Project Nemo

The best-documented DACH reference is Allianz Project Nemo, a hierarchical supervisor system for food-spoilage claims following natural disasters. Seven specialised agents work together: Planner, Cyber, Coverage, Weather, Fraud, Payout and Audit. The full seven-agent workflow finishes in under five minutes; a human claims handler then reviews the audit summary and makes the final payout decision. Human-in-the-loop is explicit policy here.

The numbers: launched in Australia in July 2025, deployed in under 100 days, with an 80% reduction in handling and settlement time for eligible claims under AUD 500. Allianz is a German insurer and is evaluating the replication of the modular architecture across its global P&C portfolio.

Architecturally instructive is the dedicated audit agent: it produces a complete summary of all agent decisions and thus a gap-free audit trail for compliance and human review. The lesson for DACH: auditability belongs built into the agent topology, not just into the logging pipeline.
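What "auditability in the topology" can mean in code is sketched below. This is not Allianz's implementation, whose internals the text does not describe: the checks, the `run_agent` wrapper and the approval callback are hypothetical. The idea shown is that every agent call passes through one function that records the decision, and that the payout depends on a human reading the summary.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Decision:
    agent: str
    outcome: str


@dataclass
class Claim:
    amount_aud: float
    decisions: List[Decision] = field(default_factory=list)


def run_agent(claim: Claim, agent: str, check: Callable[[Claim], str]) -> None:
    """Every agent call goes through here, so no decision can skip the trail."""
    claim.decisions.append(Decision(agent=agent, outcome=check(claim)))


def audit_summary(claim: Claim) -> str:
    """Stand-in for the audit agent: condenses all recorded decisions."""
    return "; ".join(f"{d.agent}={d.outcome}" for d in claim.decisions)


def settle(claim: Claim, human_approves: Callable[[str], bool]) -> str:
    # Hypothetical checks; the real agents' logic is not described in the text.
    run_agent(claim, "coverage", lambda c: "covered" if c.amount_aud < 500 else "refer")
    run_agent(claim, "fraud", lambda c: "no flags")
    summary = audit_summary(claim)
    # The payout decision stays with a person who has read the summary.
    return "paid" if human_approves(summary) else "escalated"


claim = Claim(amount_aud=320.0)
status = settle(claim, human_approves=lambda summary: "refer" not in summary)
print(status, "|", audit_summary(claim))
```

Here the approval callback is a lambda for the sake of a runnable example; in practice it would block on a human reviewer's decision.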

For agencies and B2B decision-makers

Anyone starting a multi-agent project in the DACH mid-market is well advised to use LangGraph or n8n as the orchestration layer, MCP for tools and A2A for every cross-platform handshake. For corporations with a dominant SaaS landscape, the platform-native orchestration (Agentforce, SAP Joule Studio, Copilot Studio) is the obvious foundation, with A2A as the connecting link. Important for agencies advising clients: start with one clearly delineated, parallelisable use case, keep the write path single-threaded, build human-in-the-loop into the critical points, and bear in mind the longer DPA chain under GDPR Art. 28 as well as a transfer impact assessment for each cross-border A2A hop. The Allianz Nemo benchmark (under 100 days to live) is achievable for tightly scoped projects.

FAQ

What is the difference between the supervisor pattern and orchestrator-worker?
Both have a central, steering agent. In orchestrator-worker, a lead agent decomposes the task and dynamically spawns sub-agents with their own context windows that work in parallel and return compressed results (Anthropic Research Agent). The supervisor pattern more narrowly describes the routing element: after each step, the supervisor decides which fixed specialist agent acts next. In practice the two terms overlap heavily; AWS even uses the label Multi-Agent Collaboration for its supervisor model.
What can I use to implement a supervisor pattern?
As of 2026, the two production-ready paths are LangGraph 1.0 (open source, MIT, hierarchical supervisor template, durable state, since 22 October 2025) for your own stacks, and AWS Bedrock AgentCore Multi-Agent Collaboration (Supervisor + Collaborator, GA March 2025) for AWS environments. For cross-vendor handshakes you use the A2A protocol, and for tool access MCP.
Why does the supervisor become the bottleneck?
Every decision runs through the supervisor: it routes, waits for worker results, then decides again. This makes it a single point of failure and a latency bottleneck, because steps pass through it serially. On long runs, plan drift is added - the supervisor loses track of its plan and re-invokes sub-tasks that have already been completed.
When should I not use the supervisor pattern?
For tightly coupled, sequential write tasks where every step shapes the decisions of all following steps - the classic example is code generation across multiple files. Cognition.ai showed in 2025 that parallel writer swarms make incompatible decisions due to context fragmentation. For short, clearly defined workflows, a single agent with tools or a deterministic pipeline is also cheaper and easier to debug.
Is the supervisor pattern suitable for regulated DACH industries?
Yes, provided that auditability and human-in-the-loop are built into the agent topology. Allianz Project Nemo uses a dedicated audit agent that summarises all agent decisions and produces a complete audit trail; the final decision remains with a human. Points to consider are the longer DPA chain under GDPR Art. 28 and - for cross-border A2A hops - transfer impact assessments per border crossing.
