1.2 · Beginner · 6 min

The 5 Components of an AI Agent Explained

Blck Alpaca
Definition

An AI Agent consists of five core components: an LLM core as the reasoning engine, Memory (short-term and long-term), Tools (APIs, MCP servers, code sandbox), a Planner for goal decomposition, and an Executor that runs tool calls and enforces guardrails. These components work together in an iterative loop – Perceive, Reason, Act, Observe – to autonomously pursue a given goal.

Key Takeaways

  • A fully fledged AI Agent combines five components: LLM core, Memory, Tools, Planner, and Executor. If one of them is missing, the system is usually a chatbot, workflow, or assistant.
  • The LLM core is the reasoning engine: it dynamically decides on the next step and the tool choice, so primary control does not sit in deterministic code.
  • The Perceive→Reason→Act→Observe loop (conceptual basis: ReAct, Yao et al. 2022) is the operating principle that iteratively connects the five components until the goal is reached or aborted.
  • Memory distinguishes between short-term (conversation context) and long-term (vector/RAG/files); Tools range from function calls and APIs to MCP servers and a code sandbox.
  • The maturity levels L1–L5 range from a rule-based reflex bot to a coordinated multi-agent system; true autonomy only begins at L4, when the LLM dynamically controls sequence and tool choice.
  • The Executor manages turns, loop limits, and guardrails, including human-in-the-loop for irreversible actions, which matters for compliance in the DACH region (Germany, Austria, Switzerland).

What are the components of an AI Agent?

An AI Agent is a software-based system built on a (Large) Language Model that autonomously pursues a given goal: it perceives its environment, plans in multiple steps, independently selects and uses external tools, observes results, and iteratively adjusts its plan. Technically, this capability can be traced back to five core components that work together in a fixed cycle.

The three most important points up front:

  • Five components are mandatory. LLM core, Memory, Tools, Planner, and Executor must work together. If one of them is missing, the system is usually a chatbot, a workflow, or an assistant/copilot rather than an agent.
  • The LLM controls, not the code. The sequence of steps and the tool choice emerge dynamically at runtime. This is precisely what distinguishes an agent from a deterministic pipeline.
  • The loop holds everything together. Perceive → Reason → Act → Observe is the operating principle that connects the components and runs iteratively until the goal is reached or aborted.

The 5 core components in detail

1. LLM core (reasoning engine). The language model is the brain of the agent. It interprets the goal, plans the next step, and decides via function calling which tool is invoked. Anthropic draws a clean distinction here: with agents, the LLM dynamically controls the path and tool use, whereas workflows run through predefined code paths.
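
To make the function-calling step concrete, here is a minimal sketch of how code around the LLM core might parse and validate a tool choice. The reply format (`{"name": ..., "arguments": ...}`), the tool names, and the helper `parse_tool_call` are assumptions for illustration, not the API of a specific provider.

```python
import json

ALLOWED_TOOLS = {"web_search", "browser"}  # hypothetical tool names


def parse_tool_call(raw: str) -> tuple[str, dict]:
    """Parse a model reply of the assumed form
    {"name": "<tool>", "arguments": {...}} and validate the tool name."""
    call = json.loads(raw)
    name = call["name"]
    if name not in ALLOWED_TOOLS:
        raise ValueError(f"model requested unknown tool: {name}")
    return name, call.get("arguments", {})


# The reply below is hand-written; a real run would receive it from the model.
reply = '{"name": "web_search", "arguments": {"query": "DACH competitors"}}'
name, arguments = parse_tool_call(reply)
```

The point of the sketch is the division of labor: the model only names a tool and its arguments, while ordinary code checks the request before anything runs.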

2. Memory. Agents need memory on two levels. Short-term memory is the conversation context of the current run (which steps have already been taken, which results are available). Long-term memory is typically implemented via vector databases, RAG, or files and provides knowledge beyond individual sessions.
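
The two memory levels can be sketched roughly as follows. The class names `ShortTermMemory` and `LongTermMemory` are hypothetical, and the keyword-overlap ranking is only a stand-in for the vector search a real long-term store would likely use.

```python
from dataclasses import dataclass, field


@dataclass
class ShortTermMemory:
    """Conversation context of the current run (hypothetical name)."""
    steps: list[str] = field(default_factory=list)

    def add(self, entry: str) -> None:
        self.steps.append(entry)


@dataclass
class LongTermMemory:
    """Knowledge that survives sessions. Keyword overlap stands in for
    vector search here (assumption for illustration)."""
    notes: list[str] = field(default_factory=list)

    def store(self, note: str) -> None:
        self.notes.append(note)

    def recall(self, query: str, top_k: int = 1) -> list[str]:
        query_words = set(query.lower().split())
        # Score each note by the number of words it shares with the query.
        scored = [(len(query_words & set(n.lower().split())), n) for n in self.notes]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [note for score, note in ranked[:top_k] if score > 0]


long_term = LongTermMemory()
long_term.store("Competitor A charges per seat")
long_term.store("Office plants need water on Mondays")

short_term = ShortTermMemory()
short_term.add("goal: summarize competitor pricing")
short_term.add("recalled: " + "; ".join(long_term.recall("competitor pricing")))
```

Short-term memory is discarded at the end of a run, while whatever is written to the long-term store remains available to later sessions.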

3. Tools. Tools are the hands of the agent – the interface to the outside world. These include function calls, REST APIs, database access, a browser, a code sandbox, and increasingly MCP servers (Model Context Protocol). It is only through tools that an agent can actually act beyond pure text output.
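
One common way to expose tools is a registry that maps a name to a description and a callable. The sketch below assumes that pattern; `Tool`, `REGISTRY`, `call_tool`, and both stub tools are hypothetical names, and the search function returns canned text instead of calling a real API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    name: str
    description: str  # shown to the LLM so it can choose a tool
    run: Callable[[str], str]


def web_search(query: str) -> str:
    # Stub: a real tool would call a search API here.
    return f"results for '{query}'"


def word_count(text: str) -> str:
    return str(len(text.split()))


REGISTRY = {tool.name: tool for tool in (
    Tool("web_search", "Search the web for a query", web_search),
    Tool("word_count", "Count the words in a text", word_count),
)}


def call_tool(name: str, argument: str) -> str:
    """Dispatch a tool call by name; unknown names become an error result."""
    tool = REGISTRY.get(name)
    if tool is None:
        return f"error: unknown tool '{name}'"
    return tool.run(argument)
```

Returning an error string for unknown tools, rather than raising, lets the agent observe the failure and pick a different tool in the next turn.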

4. Planner. The Planner decomposes the overarching goal into sub-steps. This happens either implicitly within the LLM (the model figures out the sequence itself) or explicitly as a graph or state machine, as enabled by frameworks like LangGraph.
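
An explicit plan can be pictured as a small state machine in which each node does its work and names its successor. The sketch below uses plain Python and deliberately does not reproduce the LangGraph API; the node names and the stubbed data are assumptions.

```python
from typing import Any, Callable

State = dict[str, Any]


def search(state: State) -> str:
    state["hits"] = ["vendor-a", "vendor-b"]  # stubbed search result
    return "extract"


def extract(state: State) -> str:
    state["prices"] = {hit: "unknown" for hit in state["hits"]}
    return "summarize"


def summarize(state: State) -> str:
    state["summary"] = f"{len(state['prices'])} vendors reviewed"
    return "done"


NODES: dict[str, Callable[[State], str]] = {
    "search": search,
    "extract": extract,
    "summarize": summarize,
}


def run_plan(start: str, state: State, max_steps: int = 10) -> list[str]:
    """Walk the graph from `start` until a node returns "done"."""
    visited: list[str] = []
    current = start
    while current != "done" and len(visited) < max_steps:
        visited.append(current)
        current = NODES[current](state)
    return visited
```

Here the transitions are hard-coded, which corresponds to the explicit variant. With implicit planning, the LLM would choose the next node at runtime instead.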

5. Executor. The Executor runs the tool calls chosen by the LLM, manages the individual turns, and enforces safety boundaries: loop limits against infinite loops, guardrails, and – for irreversible actions – human-in-the-loop. It is the component that makes autonomy controllable.
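
The three safety boundaries named above (loop limit, guardrail, human-in-the-loop) can be sketched in a few lines. The class layout, the tool names in `IRREVERSIBLE`, and the `approve` callback are hypothetical; a real approval step would prompt a person rather than call a lambda.

```python
from dataclasses import dataclass
from typing import Callable

IRREVERSIBLE = {"send_payment", "delete_records"}  # hypothetical tool names


class LoopLimitExceeded(RuntimeError):
    """Raised when the agent exceeds its turn budget."""


@dataclass
class Executor:
    tools: dict[str, Callable[[str], str]]
    approve: Callable[[str, str], bool]  # human-in-the-loop callback
    max_turns: int = 5
    turns: int = 0

    def execute(self, tool_name: str, argument: str) -> str:
        self.turns += 1
        # Loop limit: stop runaway agents.
        if self.turns > self.max_turns:
            raise LoopLimitExceeded(f"stopped after {self.max_turns} turns")
        # Guardrail: only registered tools may run.
        if tool_name not in self.tools:
            return f"blocked: '{tool_name}' is not an allowed tool"
        # Human-in-the-loop: irreversible actions need explicit approval.
        if tool_name in IRREVERSIBLE and not self.approve(tool_name, argument):
            return f"blocked: human rejected '{tool_name}'"
        return self.tools[tool_name](argument)


executor = Executor(
    tools={
        "lookup": lambda arg: f"record for {arg}",
        "send_payment": lambda arg: f"paid {arg}",
    },
    approve=lambda tool, arg: False,  # stand-in for a real approval prompt
    max_turns=3,
)
```

Blocked calls come back as ordinary results so the LLM can react to them, whereas the loop limit raises an exception because the run itself should end.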

The reasoning loop: Perceive → Reason → Act → Observe

The five components only take effect within the cycle:

  1. Perceive – The agent takes in input, goal, context, and memory.
  2. Reason – The LLM core plans: which step, which tool next?
  3. Act – The Executor runs the tool call, API call, or code.
  4. Observe – The result is read and written to memory.

After this, the agent checks: goal reached? If not, it goes back to Perceive. The conceptual foundation of this loop is the ReAct paradigm (Yao et al. 2022), which interleaves reasoning and acting.

Concrete example: research agent

An employee instructs an agent: "Find the three largest competitors in the DACH market and summarize their pricing models." Perceive: The agent reads the task. Reason: The LLM core decides to start with a web search first. Act: The Executor invokes the search tool. Observe: The hits land in memory. The LLM recognizes that detail pages are missing, invokes a browser tool in the next turn, extracts prices, writes them to memory – and finally creates the summary. No one programmed the sequence of steps in advance; the LLM determined it at runtime. This is precisely the difference from a rigid workflow.
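
The loop and the research example can be sketched together as follows. `scripted_llm` is a hard-coded stand-in for a real model call, and the tool names and return strings are invented for illustration; only the control flow is the point.

```python
from typing import Callable

# An LLM decision: ("call", tool_name, argument) or ("finish", answer, "").
Decision = tuple[str, str, str]


def scripted_llm(goal: str, memory: list[str]) -> Decision:
    """Stand-in for a real model call (assumption): picks the next step
    based on what is already in memory."""
    if not any(m.startswith("search:") for m in memory):
        return ("call", "search", goal)
    if not any(m.startswith("browse:") for m in memory):
        return ("call", "browse", "detail pages")
    return ("finish", f"summary based on {len(memory)} observations", "")


TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda arg: f"hits for {arg}",
    "browse": lambda arg: f"prices from {arg}",
}


def run_agent(goal: str, llm=scripted_llm, max_turns: int = 8) -> str:
    memory: list[str] = []
    for _ in range(max_turns):
        # Perceive: goal plus memory form the context for this turn.
        action, name, argument = llm(goal, memory)  # Reason
        if action == "finish":
            return name  # for "finish", the second field carries the answer
        result = TOOLS[name](argument)              # Act
        memory.append(f"{name}: {result}")          # Observe
    return "aborted: turn limit reached"
```

Calling `run_agent("competitor pricing")` runs a search turn, then a browse turn, then finishes. Nothing in `run_agent` fixes that order; it follows from the decisions the (here scripted) LLM returns each turn.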

Maturity levels L1–L5: from reflex to multi-agent system

Not every "agentic" system is equally autonomous. A five-level maturity scale helps with classification:

| Level | Type | Characteristic | Example |
|-------|------|----------------|---------|
| L1 | Reflex | Rule-based, no LLM control | FAQ bot |
| L2 | Augmented LLM | LLM + a single tool call, reactive | LLM with search function |
| L3 | Workflow agent | LLM in a deterministic pipeline (prompt chaining, routing) | Classified ticket flow |
| L4 | Autonomous agent | LLM dynamically controls sequence + tool choice, full loop | Claude Code, Deep Research |
| L5 | Multi-agent system | Multiple autonomous agents coordinate via A2A (orchestrator + specialists) | Coordinated specialist teams |

Crucially, true goal-oriented autonomy only begins at L4. At L1–L3, the path is fully or partially predetermined, so a workflow or assistant is often the more honest and cheaper choice. At L5, multiple agents coordinate via the A2A protocol (with the Linux Foundation since June 2025, 150+ organizations). This coordination brings additional risks such as compounding errors, where one agent's mistake propagates through the agents that build on its output.
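
As a rough aid for classification, the scale can be written as a decision rule. This is a simplified sketch: the `SystemProfile` fields and the function `maturity_level` are hypothetical, and real systems will not always fit four booleans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemProfile:
    uses_llm: bool
    fixed_pipeline: bool         # code predefines the step sequence
    llm_controls_sequence: bool  # LLM picks steps and tools at runtime
    multiple_agents: bool


def maturity_level(p: SystemProfile) -> str:
    if not p.uses_llm:
        return "L1"  # rule-based reflex
    if p.llm_controls_sequence:
        return "L5" if p.multiple_agents else "L4"
    # The LLM is present but does not control the path.
    return "L3" if p.fixed_pipeline else "L2"


faq_bot = SystemProfile(False, False, False, False)
ticket_flow = SystemProfile(True, True, False, False)
research_agent = SystemProfile(True, False, True, False)
```

The decisive question is the second check: whether the LLM or the code controls the sequence separates L1–L3 from L4 and L5.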

Maturity level and market reality

The architecture sounds powerful, but scaling remains demanding. According to McKinsey State of AI 2025, only 23% of companies are scaling at least one agentic use case, and a further 39% are experimenting; in no business function does the share of companies with scaled agents exceed 10%. The implication: more components and higher maturity levels also mean more maintenance effort (prompts, tools, evals, models). A pilot should therefore start small, read-only, and with a clear ROI.

Conclusion

The five components – LLM core, Memory, Tools, Planner, Executor – and the Perceive→Reason→Act→Observe loop are the common foundation of every AI Agent. Those who understand them can expose "agent washing" and choose the appropriate maturity level for each use case, instead of building the most complex architecture on principle.

FAQ

Which five components does an AI Agent have?
LLM core (reasoning engine), Memory (short-term and long-term), Tools (APIs, MCP servers, code sandbox), Planner (goal decomposition), and Executor (tool execution, loop limits, guardrails). All five must work together – otherwise it is not a fully fledged agent.
What is the difference between short-term and long-term memory?
Short-term memory is the conversation context of the current run – that is, previous steps and intermediate results. Long-term memory stores knowledge beyond sessions, typically via vector databases, RAG, or files.
What does the Perceive→Reason→Act→Observe loop mean?
It is the operating principle of an agent: it perceives the goal and context (Perceive), plans the next step via the LLM (Reason), runs a tool call (Act), and reads the result (Observe). If the goal is not reached, the loop starts over. The conceptual basis is the ReAct paradigm (Yao et al. 2022).
At which maturity level is something a true agent?
True goal-oriented autonomy begins at L4: here the LLM dynamically controls sequence and tool choice in the full loop (e.g., Claude Code, Deep Research). L1–L3 (reflex bot, augmented LLM, workflow agent) follow a fully or partially predetermined path.
What is the Executor for?
The Executor runs the tool calls chosen by the LLM and makes autonomy controllable: it manages turns, sets loop limits against infinite loops, enforces guardrails, and brings a human into the loop for irreversible actions (human-in-the-loop).
Do I need a fully autonomous agent for every use case?
No. An agent is only worthwhile when the solution path cannot be planned in advance and an LLM decision is needed. For predictable processes, a workflow (L3) or an assistant/copilot is often cheaper and requires less maintenance.
