1.9 · Beginner · 7 min

The History of AI Agents

Blck Alpaca
Definition

The history of AI Agents stretches from the classic agent concept in AI research (a system that perceives its environment and reacts to it with actions), through the ReAct pattern (2022), which combines reasoning and acting in the same LLM loop, all the way to today's frontier LLM agents. Open standards such as MCP (introduced in late 2024, spec revision 2025-11-25) and A2A (2025) connect agents to tools and to one another.

Key Takeaways

  • The classic agent concept (perceive-act loop) is decades old; what is new is LLM-driven, multi-step planning as the control core.
  • ReAct (Yao et al. 2022, arXiv:2210.03629) introduced the interleaving of reasoning and acting in the same LLM loop and laid the conceptual foundation of modern agents.
  • Frontier LLM agents control sequence and tool choice dynamically (Maturity L4) and coordinate as multi-agent systems (L5).
  • MCP (Spec 2025-11-25, >10,000 MCP servers) standardizes the agent-to-tool connection; A2A (Linux Foundation since June 2025, 150+ organizations) the agent-to-agent connection.
  • Despite the hype, maturity remains low: according to McKinsey, only 23 percent of companies were scaling at least one agentic use case in 2025, and Gartner expects over 40 percent of agentic AI projects to be cancelled by the end of 2027.
  • Standardization and realism instead of agent-washing characterize the current phase: open protocols and outcome KPIs determine whether productive deployment succeeds.

The most important milestones at a glance:

  • Classic agent concept: An agent perceives its environment and acts on it in a goal-directed way (perceive-act). This basic principle is old; what is new is the LLM core as the control layer.
  • ReAct (2022): Reasoning and acting merge in the same LLM loop. The perceive-reason-act-observe pattern becomes the blueprint for modern agents.
  • Standardization (2025): MCP connects agents to tools, A2A connects agents to one another. Both become open protocols under the umbrella of the Linux Foundation.

The Classic Agent Concept: Perception and Action

The concept of the agent is considerably older than today's LLM wave. In classic AI, an agent refers to a system that perceives its environment and reacts to it with actions in order to pursue a goal. The simplest form is the reflex agent: rule-based, reactive, without a real model of the world. In today's maturity classification, this corresponds to level L1 (reflex), such as a FAQ bot or a thermostat.

These early agents already had two of the four properties of modern AI Agents in rudimentary form: perception of the environment and goal-oriented action. What was missing was a powerful reasoning core that could plan in multiple steps and select tools dynamically. It is exactly this gap that Large Language Models close.
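
The reflex agent described above can be sketched in a few lines. This is a minimal illustration, assuming a thermostat with a hypothetical target temperature and tolerance band; the function name and thresholds are invented for the example. The point is that the current percept maps directly to an action through fixed rules, with no planning and no model of the world.

```python
def reflex_thermostat(temperature_c: float, target_c: float = 21.0, band_c: float = 0.5) -> str:
    """Map the current percept directly to an action using fixed rules (L1 reflex)."""
    if temperature_c < target_c - band_c:
        return "heat_on"
    if temperature_c > target_c + band_c:
        return "heat_off"
    return "hold"


# Each reading is handled in isolation; the agent keeps no state between calls.
for reading in (18.0, 21.2, 23.5):
    print(reading, "->", reflex_thermostat(reading))
```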

The Turning Point: ReAct (2022)

The conceptual breakthrough came with the ReAct pattern (Yao et al. 2022, arXiv:2210.03629). ReAct stands for reasoning and acting in the same LLM loop: the model does not merely think about a task, but interweaves these considerations directly with actions, such as tool calls, and observes the results before planning further.

This gives rise to the reasoning loop that still forms the core of an agent today: Perceive → Reason → Act → Observe, repeated iteratively until the goal is reached or the run is aborted. ReAct provided the link between the old agent concept (perception-action) and the new world of language models. Only now were all four mandatory properties of an AI Agent in place: LLM-driven control, multi-step planning, tool use, and goal-oriented autonomy within guardrails.
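
The loop can be made concrete with a small sketch. It is not the implementation from the ReAct paper: the model is replaced by a scripted stand-in (`scripted_model`), and the tool registry, tool name, and turn limit are assumptions chosen for illustration. What it shows is the shape of the loop: reason, act through a tool, feed the observation back, and stop when the goal is reached or the turn limit aborts the run.

```python
from typing import Callable

# Hypothetical tool registry; the single tool is purely illustrative.
TOOLS: dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
}


def scripted_model(history: list[str]) -> dict:
    """Stand-in for an LLM call: chooses the next step from the transcript so far."""
    observations = [line for line in history if line.startswith("Observation: ")]
    if not observations:
        return {"thought": "I should count the words with a tool.",
                "action": "word_count", "input": "perceive reason act observe"}
    count = observations[-1].split(": ", 1)[1]
    return {"thought": "The observation answers the question.",
            "final": f"The text has {count} words."}


def react_loop(goal: str, model=scripted_model, max_turns: int = 5) -> str:
    history = [f"Goal: {goal}"]                        # Perceive: task and context
    for _ in range(max_turns):
        step = model(history)                          # Reason: decide the next step
        history.append(f"Thought: {step['thought']}")
        if "final" in step:
            return step["final"]                       # Goal reached
        result = TOOLS[step["action"]](step["input"])  # Act: call the chosen tool
        history.append(f"Observation: {result}")       # Observe: feed the result back
    return "aborted: turn limit reached"               # Run aborted by the loop limit


print(react_loop("How many words are in the phrase?"))
```

In a real agent, `scripted_model` would be a call to an LLM that receives the transcript as its prompt; the loop around it stays the same.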

From Augmented LLMs to Autonomous Agents

After ReAct, the agent landscape developed further along a maturity model that captures the increasing autonomy well:

| Level | Designation | Characteristic | Example |
|-------|-------------|----------------|---------|
| L1 | Reflex | Rule-based, reactive | FAQ bot, thermostat |
| L2 | Augmented LLM | LLM plus single tool call, reactive | ChatGPT with web search |
| L3 | Workflow agent | LLM in a deterministic pipeline (prompt chaining, routing) | Structured processing chain |
| L4 | Autonomous agent | LLM controls sequence and tool choice dynamically, full loop | Claude Code, Deep Research |
| L5 | Multi-agent system | Several autonomous agents coordinate via A2A | Orchestrator plus specialists |

The decisive leap lies between L3 and L4. With a workflow agent, the path is defined in advance; the LLM merely fills gaps in a deterministic pipeline. An autonomous agent (L4), by contrast, decides for itself which tools it uses and in what order, and adjusts its plan iteratively. This is exactly where an agent pays off: when the solution path cannot be planned in advance.

A concrete example of L4 is a coding agent like Claude Code. It receives a goal (such as fixing a bug), independently reads files for this purpose, runs tests, observes the results, and decides on this basis which file to change next. The sequence of steps is not specified in a script but emerges within the loop. This is the practical difference between a classic workflow and a true agent.
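
The difference between L3 and L4 can be illustrated with a toy sketch. Everything here is an assumption made for the example: the in-memory "repository", the single test, and `choose_next_action`, which stands in for the LLM's decision. It is not how Claude Code or any real coding agent is implemented. The fixed workflow runs a predefined list of steps, while the agent picks its next action from what it has observed so far.

```python
# Toy in-memory "repository"; file name and contents are made up for illustration.
repo = {"calc.py": "def add(a, b):\n    return a - b\n"}


def run_tests(files: dict[str, str]) -> str:
    """Execute the toy module and check a single expectation."""
    namespace: dict = {}
    exec(files["calc.py"], namespace)
    return "pass" if namespace["add"](2, 3) == 5 else "fail: add(2, 3) != 5"


def fixed_workflow(files: dict[str, str]) -> list[str]:
    """L3 style: the order of steps is written down in advance."""
    return ["read calc.py", "run tests: " + run_tests(files)]


def choose_next_action(observations: list[str]) -> str:
    """Stand-in for the LLM: picks the next tool from what was observed so far."""
    if not observations:
        return "run_tests"
    if observations[-1].startswith("fail"):
        return "edit_file"
    if observations[-1] == "edited":
        return "run_tests"
    return "done"


def autonomous_agent(files: dict[str, str], max_turns: int = 6) -> list[str]:
    """L4 style: the sequence of steps emerges inside the loop."""
    observations: list[str] = []
    trace: list[str] = []
    for _ in range(max_turns):
        action = choose_next_action(observations)
        if action == "done":
            break
        trace.append(action)
        if action == "run_tests":
            observations.append(run_tests(files))
        elif action == "edit_file":
            files["calc.py"] = files["calc.py"].replace("a - b", "a + b")
            observations.append("edited")
    return trace


print(fixed_workflow(dict(repo)))    # stops at the failing test; no step for fixing exists
print(autonomous_agent(dict(repo)))  # tests, edits, then tests again
```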

Frontier LLMs as the New Agent Core

With recent frontier LLMs, the reasoning loop became robust enough for productive tasks. Five components make up an agent today:

  • LLM core: performs the reasoning.
  • Memory: short-term as context; long-term via vectors, RAG, or files.
  • Tools: function calls, APIs, MCP servers, browser, code sandbox.
  • Planner: breaks a goal down into sub-steps.
  • Executor: carries out the tool calls while monitoring turns, loop limits, and guardrails.

This architecture also explains why no dedicated, self-trained model is necessary: LLMs accessed via API provide the reasoning core, and the agent logic lives in the layer above it.
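
How the five components fit together can be sketched as a small class. This is one possible arrangement under stated assumptions, not a reference architecture: the class name `Agent`, the `tool:argument` plan format, and `fake_llm` are invented for illustration, and the planner here plans once up front rather than replanning inside a loop. The LLM core is any text-in, text-out callable, which is why a hosted model can be swapped in without training anything.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    llm: Callable[[str], str]                        # LLM core: any text-in, text-out callable
    tools: dict[str, Callable[[str], str]]           # tools the executor is allowed to call
    memory: list[str] = field(default_factory=list)  # short-term memory as plain context
    max_steps: int = 4                               # loop limit enforced by the executor

    def plan(self, goal: str) -> list[tuple[str, str]]:
        """Planner: ask the LLM core for 'tool:argument' lines, one per sub-step."""
        steps = []
        for line in self.llm(f"Plan steps for: {goal}").splitlines():
            name, sep, argument = line.partition(":")
            if sep:
                steps.append((name, argument))
        return steps

    def execute(self, goal: str) -> list[str]:
        """Executor: run the planned tool calls within limits and guardrails."""
        for tool_name, argument in self.plan(goal)[: self.max_steps]:
            if tool_name not in self.tools:           # guardrail: unknown tools are refused
                self.memory.append(f"{tool_name}: refused")
                continue
            self.memory.append(f"{tool_name}: {self.tools[tool_name](argument)}")
        return self.memory


def fake_llm(prompt: str) -> str:
    """Stand-in for an API-hosted model; always returns the same fixed plan."""
    return "upper:hello\ndelete_everything:now\nreverse:agent"


agent = Agent(llm=fake_llm, tools={"upper": str.upper, "reverse": lambda s: s[::-1]})
print(agent.execute("demo goal"))
```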

2025: The Year of Standardization (MCP and A2A)

Until 2024, tool integrations were largely proprietary. 2025 brought decisive standardization along two dimensions.

MCP (Model Context Protocol) standardizes the connection between agent and tool. The specification carries the version 2025-11-25; in December 2025, MCP was handed over to the Agentic AI Foundation, which operates under the Linux Foundation. By now, more than 10,000 MCP servers exist. MCP thus answers the question of how an agent accesses external tools in a vendor-neutral way.
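
To give a feel for what "vendor-neutral" means on the wire: MCP messages are JSON-RPC 2.0, and the specification defines methods such as `tools/list` and `tools/call`. The sketch below only builds two such request messages. The helper `jsonrpc_request`, the tool name `search_tickets`, and its arguments are hypothetical, and transport, the initialization handshake, and responses are left out entirely.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # each request in a session gets its own id


def jsonrpc_request(method: str, params: Optional[dict] = None) -> str:
    """Build a JSON-RPC 2.0 request as a JSON string."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool by name; the tool and its arguments are invented for this example.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "search_tickets", "arguments": {"query": "login bug"}},
)

print(list_tools)
print(call_tool)
```

Because every server answers the same two methods, the agent side needs no tool-specific integration code; only the tool names and argument schemas differ.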

A2A (Agent-to-Agent) standardizes the connection between agents and thereby forms the technical foundation for multi-agent systems (L5). A2A has been with the Linux Foundation since June 2025 and is supported by more than 150 organizations. This makes the coordination of several specialized agents, such as orchestrator plus domain agents, interoperable across vendor boundaries.

These two protocols mark the transition from isolated individual agents to an open ecosystem.
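
The orchestrator-plus-specialists pattern can be sketched in-process. This does not show the A2A wire format: the specialists are plain functions, and `route` is a keyword rule standing in for the orchestrator's LLM decision; all names are hypothetical. In an interoperable setup, each specialist would be a separate agent, possibly from a different vendor, reached over A2A.

```python
from typing import Callable

# Hypothetical specialists; here they are local functions rather than remote agents.
SPECIALISTS: dict[str, Callable[[str], str]] = {
    "billing": lambda task: f"[billing] handled: {task}",
    "support": lambda task: f"[support] handled: {task}",
}


def route(task: str) -> str:
    """Stand-in for the orchestrator's decision: pick a specialist by keyword."""
    return "billing" if "invoice" in task.lower() else "support"


def orchestrate(tasks: list[str]) -> list[str]:
    """Delegate each task to the chosen specialist and collect the results."""
    return [SPECIALISTS[route(task)](task) for task in tasks]


print(orchestrate(["Invoice 17 is wrong", "Cannot log in"]))
```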

Where We Stand Today: Hype Meets Reality

The historical arc ends, for now, in a phase of disillusionment and consolidation. Adoption is clearly rising, but productive maturity remains low.

According to McKinsey State of AI 2025, only 23 percent of companies are scaling at least one agentic use case, and a further 39 percent are experimenting; in no business function does the share of scaled use cases exceed 10 percent. Gartner expects (as of June 2025) that over 40 percent of agentic AI projects will be cancelled by the end of 2027. This is accompanied by agent-washing: according to Gartner, only about 130 of the thousands of vendors claiming agentic AI have genuine agent capabilities.

The lesson from this most recent history is pragmatic: it is not the framework but the use case that decides. An agent is worthwhile where the solution path cannot be planned in advance; everywhere else, a workflow or classic RPA is cheaper and more robust. Open standards such as MCP and A2A reduce lock-in to individual vendors and make architectures more future-proof.

Note: The regulatory and legal aspects mentioned are informational and do not constitute legal advice.

FAQ

When does the history of AI Agents begin?
The agent concept itself is old: in classic AI, an agent is a system that perceives its environment and acts in a goal-directed way. The modern, LLM-driven generation begins conceptually in 2022 with the ReAct pattern (Yao et al., arXiv:2210.03629).
What is the ReAct pattern and why is it a milestone?
ReAct (2022) combines reasoning and acting in the same LLM loop. The model interweaves its considerations directly with actions such as tool calls and observes the results. From this emerges the perceive-reason-act-observe loop, which still forms the core of modern agents today.
What distinguishes an autonomous agent from a workflow?
A workflow follows a predefined, deterministic path; the LLM merely fills gaps. An autonomous agent (Maturity L4), by contrast, decides for itself on sequence and tool choice and adjusts its plan iteratively. An agent only becomes worthwhile when the solution path cannot be planned in advance.
What are MCP and A2A and why were they so important in 2025?
MCP (Model Context Protocol, Spec 2025-11-25, over 10,000 servers) standardizes the agent-to-tool connection. A2A (Agent-to-Agent, Linux Foundation since June 2025, over 150 organizations) standardizes the agent-to-agent connection. Both mark the transition from isolated individual agents to an open ecosystem.
Do I need my own, self-trained LLM for an agent?
No. LLMs accessed via API provide the reasoning core. The agent logic (memory, tools, planner, executor) lives in the layer above it. Training your own model is not required to operate an agent.
How mature are AI Agents really at the moment?
Adoption is rising, but productive maturity remains low. According to McKinsey State of AI 2025, only 23 percent of companies are scaling at least one agentic use case. Gartner expects (as of June 2025) that over 40 percent of agentic AI projects will be cancelled by the end of 2027.
