16.4 · Intermediate · 7 min

Tool Misuse and Excessive Agency: When AI Agents Are Allowed to Do Too Much

Blck Alpaca · 9 June 2026

Definition

Excessive Agency refers to the overly broad autonomy, permissions or functionality of an AI agent; it can do more than its task would require. Tool Misuse is the abusive use of legitimate tools: access is authorised, the usage is not. Both lead to unintended actions, data exfiltration and uncontrolled costs.

Key Takeaways

✓ Excessive Agency (listed in the OWASP LLM Top 10 as LLM06:2025) is the conceptual bridge to the OWASP Agentic Top 10 2026 and sits directly upstream of the agent risks ASI02 (Tool Misuse), ASI03 (Privilege Abuse) and ASI10 (Rogue Agents).
✓ Tool Misuse (ASI02) means: the agent acts within its permissions but uses a legitimate tool insecurely, for example deleting data, calling paid APIs excessively or sending emails for phishing.
✓ Documented incidents such as Amazon Q (CVE-2025-8217, July 2025) demonstrate the damage potential: destructive prompts with the flags 'trust-all-tools' and 'no-interactive' would have deleted file systems and cloud resources without confirmation, across roughly 1 million installations.
✓ The most effective countermeasures are least-privilege on every tool, scope-limited and short-lived tokens, human-in-the-loop gates for destructive actions, as well as hard rate and cost limits with circuit breakers.
✓ According to research, the most common deployment mistake among DACH SMEs is running an agent 'just to make it work' under a service account with admin-equivalent rights.
✓ Auto-approve or 'YOLO' modes that disable confirmation prompts are a central risk multiplier and should be switched off for every tool with access to a database, payments, communication or deployment.

In the OWASP taxonomy, Excessive Agency forms the umbrella under which Tool Misuse materialises as a concrete agent risk.

Quick answer 1: Excessive Agency is the cause (too much autonomy/rights), Tool Misuse the consequence (a legitimate tool is used insecurely or for an unintended purpose).
Quick answer 2: OWASP lists the topic in the LLM Top 10 2025 as LLM06:2025 and in the Agentic Top 10 2026 as ASI02 (Tool Misuse & Exploitation), published on 9 December 2025.
Quick answer 3: The core countermeasures are least-privilege, scope-limited short-lived tokens, human approval for critical actions and hard rate or cost limits.

Why "being allowed to do too much" is the real agent problem

Classic LLM applications respond: a prompt goes in, a completion comes out. Agentic systems plan, choose tools, write to memory and act, with minimal step-by-step human approval. It is precisely this leap from "responding" to "acting" that makes the question of permissions the central security issue.

OWASP captures this in the LLM Top 10 2025 under LLM06:2025 Excessive Agency: granting LLMs unchecked autonomy to act, with insufficient permission scoping or human oversight. This entry was explicitly extended to agentic architectures in 2025 and is, according to the research, the most important conceptual bridge for decision-makers to internalise. Excessive Agency sits directly upstream of several agent risks in the OWASP Top 10 for Agentic Applications 2026, namely ASI02 (Tool Misuse), ASI03 (Identity & Privilege Abuse) and ASI10 (Rogue Agents).

Excessive Agency typically arises at three levels:

Too many permissions: the agent inherits overly broad rights from the user or service account under which it runs.
Too much functionality: the agent has access to tools that its task does not require at all (for example a delete or payment tool in a pure read-only use case).
Too much autonomy: critical actions run without confirmation, for instance via enabled auto-approve or "YOLO" modes.
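The three levels can be checked mechanically against an agent's configuration. The following is a minimal sketch, not an OWASP-defined procedure: the `AgentConfig` fields, the `DESTRUCTIVE_TOOLS` set and the tool names are all hypothetical and only illustrate the idea of comparing what an agent has with what its task needs.

```python
from dataclasses import dataclass

# Hypothetical tool names treated as destructive for this illustration.
DESTRUCTIVE_TOOLS = {"delete_contact", "send_payment", "drop_table"}


@dataclass
class AgentConfig:
    """Illustrative agent description; all field names are assumptions."""
    granted_scopes: set[str]
    required_scopes: set[str]
    tools: set[str]
    required_tools: set[str]
    auto_approve: bool = False


def audit_excessive_agency(cfg: AgentConfig) -> list[str]:
    """Return one finding per level at which the agent exceeds its task."""
    findings = []
    # Level 1: permissions granted but not needed by the task.
    extra_scopes = cfg.granted_scopes - cfg.required_scopes
    if extra_scopes:
        findings.append(f"permissions: unused scopes {sorted(extra_scopes)}")
    # Level 2: tools exposed but not needed by the task.
    extra_tools = cfg.tools - cfg.required_tools
    if extra_tools:
        findings.append(f"functionality: unneeded tools {sorted(extra_tools)}")
    # Level 3: destructive tools that would run without confirmation.
    risky = cfg.tools & DESTRUCTIVE_TOOLS
    if cfg.auto_approve and risky:
        findings.append(f"autonomy: auto-approve covers {sorted(risky)}")
    return findings
```

An empty result means the configuration passes this particular check; it does not prove that the agent is safe.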

Tool Misuse (ASI02): legitimate access, illegitimate use

The OWASP description of ASI02 is precise: the agent operates within its authorised privileges but applies a legitimate tool in an insecure or unintended way. It deletes valuable data, calls expensive APIs excessively, performs destructive operations or exfiltrates information. The distinction from ASI03 is important: the access is legitimate, the usage is not.

Typical attack and failure vectors according to the research:

Prompt injection that repurposes a tool, for example send_email, which is suddenly used to phish the entire customer base.
Misalignment between the agent's interpretation of the task and the developer's intent.
Insecure delegation: the agent hands a powerful tool to a sub-agent without contextual safeguards.
Auto-approve / "YOLO" modes that disable confirmation prompts.

Risks in concrete terms: actions, data, costs

The three dimensions of damage can be clearly separated:

Risk dimension	What happens	Example triggers
Unintended actions	Destructive operations (DELETE, DROP, `rm -rf`, transfers), setpoints outside safety limits, mass blocking	Repurposed tool, enabled auto-approve, misaligned task interpretation
Data exfiltration	Exfiltration of customer, employee or business data via legitimate channels	Manipulated web content, compromised tool chain, insecure delegation to sub-agents
Costs ("denial-of-wallet")	Token and API cost explosion through over-invoked or recursive plans	Retry storms, unbounded multi-step plans, missing budget caps

The cost dimension is particularly delicate in the agent world: OWASP lists it in the LLM Top 10 as LLM10:2025 Unbounded Consumption ("denial-of-wallet"). According to the research, multi-step plans multiply token consumption, because every additional step adds further model and tool calls, which is why per-plan budget enforcement is mandatory.
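Per-plan budget enforcement can be sketched as a small accounting object that is charged before every step and halts the plan once a cap would be exceeded. This is an illustrative sketch only: the class and method names (`PlanBudget`, `charge`, `BudgetExceeded`) are hypothetical, and the cost figures passed in are assumed to be estimates supplied by the caller.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a plan step would exceed the plan's budget."""


class PlanBudget:
    """Tracks token and cost usage for one multi-step plan."""

    def __init__(self, token_cap: int, usd_cap: float):
        self.token_cap = token_cap
        self.usd_cap = usd_cap
        self.tokens_used = 0
        self.usd_spent = 0.0
        self.tripped = False  # circuit breaker state

    def charge(self, tokens: int, usd: float) -> None:
        """Record a step's estimated cost before it runs, or refuse it."""
        if self.tripped:
            # Once open, the breaker blocks every further step of the plan.
            raise BudgetExceeded("circuit breaker open: plan halted")
        over_tokens = self.tokens_used + tokens > self.token_cap
        over_usd = self.usd_spent + usd > self.usd_cap
        if over_tokens or over_usd:
            self.tripped = True
            raise BudgetExceeded("plan budget exhausted")
        self.tokens_used += tokens
        self.usd_spent += usd
```

Charging before execution rather than after means a retry storm or a recursive plan stops at the cap instead of overshooting it by one expensive step.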

Documented incidents (as of 2026)

The research names several substantiated cases that underpin the damage potential:

Amazon Q Code Assistant (CVE-2025-8217, July 2025): attackers compromised a GitHub token and injected malicious instructions into the VS Code extension v1.84.0. Destructive prompts combined with the 'trust-all-tools' and 'no-interactive' flags would have deleted file systems and cloud resources without confirmation. The extension was installed by roughly 1 million developers; the fix came with v1.85.0.
OpenAI Operator (Embrace The Red, Johann Rehberger): malicious website content steered the agent into accessing authenticated internal pages and disclosing addresses, phone numbers and emails from GitHub and Booking.com.
Langflow AI RCE (CVE-2025-34291): CrowdStrike observed several threat actors exploiting an unauthenticated code injection in Langflow, a widely used agent framework.

An illustrative Excessive Agency scenario is the manufacturing procurement cascade (2025) documented in the research: over three weeks, a procurement agent was gradually "convinced" that its authorisation limit was 500,000 US dollars. The attacker then placed 5 million US dollars in fraudulent orders across 10 transactions. The case combines Tool Misuse with the risk of human approvals being rubber-stamped (ASI09).

Countermeasures: defence in layers

The research structures the measures along the lifecycle. Importantly, no single safeguard is sufficient; the layers complement one another.

Layer	Measure	Effect
Design	Least-privilege on every tool; schema validation of every tool argument; pre-flight cost estimation for expensive/destructive tools	Limits what the agent can do in the first place
Build	Allow/deny lists per agent role; disable auto-approve for database, payments, communication, deployment	Prevents risky default configurations
Runtime	Human-in-the-loop gates for destructive operations; tool approval queues; scope-limited, short-lived tokens (separated for read/write/execute/delegate)	Slows down critical individual actions
Operations	Cost caps with circuit breakers; rate limits per tool, agent and tenant; SIEM detection on tool-call patterns	Caps costs and detects anomalies

In addition, the research names two architectural principles for DACH practice. The first is the gateway/proxy pattern: a gateway in front of every agent acts as the central enforcement point for allow lists, schema validation, cost caps, rate limits and audit logging. The second is the identity question: every agent is its own Non-Human Identity (NHI) with its own credential lifecycle, short-lived tokens and just-in-time provisioning. According to the research, the most common deployment mistake among DACH SMEs is running an agent "just to make it work" under a service account with admin-equivalent rights. A delegated user identity is safer: the agent acts on behalf of the human and is limited to that person's rights.
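The gateway idea can be illustrated with a small in-process sketch that combines an allow list, argument schema validation and audit logging at a single choke point. The names `ToolGateway` and `ToolCallRejected`, and the representation of a schema as a mapping from argument name to Python type, are assumptions made for this example; a production gateway would typically run as a separate service and use a fuller schema language.

```python
from typing import Any, Callable


class ToolCallRejected(Exception):
    """Raised when the gateway refuses a tool call."""


class ToolGateway:
    """Central enforcement point placed between an agent and its tools."""

    def __init__(self, allowed: dict[str, dict[str, type]]):
        # allowed maps tool name -> {argument name: expected type}
        self.allowed = allowed
        self.audit_log: list[tuple[str, str, str]] = []

    def call(self, agent_id: str, tool: str, args: dict[str, Any],
             handler: Callable[..., Any]) -> Any:
        verdict = self._check(tool, args)
        # Every attempt is logged, including rejected ones.
        self.audit_log.append((agent_id, tool, verdict))
        if verdict != "allowed":
            raise ToolCallRejected(verdict)
        return handler(**args)

    def _check(self, tool: str, args: dict[str, Any]) -> str:
        schema = self.allowed.get(tool)
        if schema is None:
            return "denied: tool not on allow list"
        if set(args) != set(schema):
            return "denied: unexpected or missing arguments"
        for name, expected in schema.items():
            if not isinstance(args[name], expected):
                return f"denied: argument {name!r} has wrong type"
        return "allowed"
```

Because the agent never holds a direct reference to a tool handler outside the gateway, a tool that is missing from the allow list stays unreachable even if a prompt injection asks for it.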

A word on the effectiveness of human approval: these gates frequently degrade in practice. OWASP lists this as a separate risk, ASI09 (Human-Agent Trust Exploitation): automation bias and authoritative-sounding output lead to rubber-stamping. Effective gates therefore enforce an independent review of the evidence, not merely a sign-off on the agent's recommendation.
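One way to make "review the evidence" enforceable rather than advisory is to refuse the sign-off until every evidence item has been opened by the reviewer. The sketch below assumes a hypothetical `ApprovalRequest` structure; how evidence is presented and how "reviewed" is detected depends on the approval tooling in use.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovalRequest:
    """Hypothetical approval record for one destructive action."""
    action: str
    recommendation: str  # the agent's own summary; not sufficient alone
    evidence: list[str]  # e.g. recipient list, diff, order details
    reviewed: set[str] = field(default_factory=set)

    def mark_reviewed(self, item: str) -> None:
        if item not in self.evidence:
            raise ValueError(f"unknown evidence item: {item}")
        self.reviewed.add(item)

    def approve(self, reviewer: str) -> str:
        if not self.evidence:
            raise PermissionError("no evidence attached: cannot approve")
        missing = [e for e in self.evidence if e not in self.reviewed]
        if missing:
            raise PermissionError(f"evidence not reviewed: {missing}")
        return f"{self.action} approved by {reviewer}"
```

This does not eliminate automation bias, but it removes the one-click path in which only the agent's recommendation is ever seen.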

Example: permissions for a newsletter agent

A marketing agent is meant to create and send newsletter drafts. Excessive Agency arises if it is given full access to the email system. Least-privilege in pseudocode:

```
agent_role: "newsletter-draft"
tools:
  read_segment:
    scope: audience.read
    rate_limit: 20/min
  create_draft:
    scope: campaign.write
    rate_limit: 10/min
  send_campaign:
    scope: campaign.send
    rate_limit: 2/h
    require_human_approval: true  # Gate: check recipients + content
deny:
  - delete_contact
  - export_audience
  - "billing.*"
budget:
  token_cap: 500000/day
  usd_cap: 20/day
  circuit_breaker: on
token:
  type: short_lived
  ttl: 15min
```

Effect: even with a successful prompt injection, the agent cannot export or delete contacts, cannot overload a paid bulk API and cannot send a campaign without human approval. The short-lived, scope-limited token devalues stolen credentials after 15 minutes.
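How such a policy might be evaluated at call time can be sketched in a few lines. The `POLICY` dictionary mirrors the tool names, deny entries and approval flag from the example above; the `decide` function, its return values and the default-deny behaviour for unknown tools are assumptions of this sketch rather than part of any specific product.

```python
from fnmatch import fnmatchcase

# Mirrors the newsletter example; the structure itself is illustrative.
POLICY = {
    "tools": {
        "read_segment": {"require_human_approval": False},
        "create_draft": {"require_human_approval": False},
        "send_campaign": {"require_human_approval": True},
    },
    "deny": ["delete_contact", "export_audience", "billing.*"],
}


def decide(tool: str, policy: dict = POLICY) -> str:
    """Return 'allow', 'needs_approval' or 'deny' for a requested tool."""
    # Explicit deny entries win, including wildcard patterns.
    if any(fnmatchcase(tool, pattern) for pattern in policy["deny"]):
        return "deny"
    rule = policy["tools"].get(tool)
    if rule is None:
        # Assumption: anything not explicitly listed is denied.
        return "deny"
    return "needs_approval" if rule["require_human_approval"] else "allow"
```

For example, `decide("billing.refund")` yields `"deny"` through the wildcard entry, while `decide("send_campaign")` yields `"needs_approval"` and has to pass the human gate first.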

For agencies and B2B

Anyone running AI agents for clients in marketing, sales or service bears responsibility for their permission design. A practical starting point: inventory every tool and every scope per agent, disable auto-approve for anything that deletes data, moves money or sends messages, and set hard cost and rate limits before an agent goes into production. Blck Alpaca from Vienna supports DACH B2B organisations in building least-privilege-compliant agent architectures along the OWASP Agentic Top 10. Please note: this article serves as professional information and does not constitute legal advice; please clarify specific compliance and liability questions with qualified advisers.

FAQ

What is the difference between Excessive Agency and Tool Misuse?

Excessive Agency is the cause, Tool Misuse the consequence. Excessive Agency describes overly broad autonomy, permissions or functionality; the agent can fundamentally do more than it needs for its task. Tool Misuse (OWASP ASI02) is the concrete incident in which an inherently legitimate tool is used insecurely or for an unintended purpose. Importantly: with Tool Misuse, the access is authorised, only the usage is not.

Where does Excessive Agency appear in the OWASP lists?

Excessive Agency is listed in the OWASP Top 10 for LLM Applications 2025 as LLM06:2025 and was explicitly extended there in 2025 to cover agentic architectures. In the OWASP Top 10 for Agentic Applications 2026 (published on 9 December 2025) it is the conceptual umbrella behind ASI02 (Tool Misuse & Exploitation), ASI03 (Identity & Privilege Abuse) and ASI10 (Rogue Agents).

Which countermeasures are most effective against Tool Misuse?

According to OWASP research, a layered defence works best: least-privilege and schema validation on every tool at the design stage; allow/deny lists per agent role and disabled auto-approve modes in the build; human-in-the-loop gates for destructive operations at runtime; and cost caps with circuit breakers and rate limits per tool, agent and tenant in operations.

What are scope-limited tokens and why are they important?

Scope-limited, short-lived tokens grant an agent only the minimal rights needed for a specific task, separated by read, write, execute and delegate, and expire quickly. If the agent is compromised, an attacker inherits only these narrowly scoped rights instead of permanent admin access. This significantly limits the damage from data exfiltration and unintended actions.
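The mechanics can be illustrated with a minimal in-memory sketch. The `ScopedToken` type and the `issue_token` and `authorize` functions are hypothetical; real deployments would rely on an identity provider and signed tokens rather than plain objects, and the 900-second default simply reflects a 15-minute lifetime.

```python
import time
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ScopedToken:
    """Illustrative token: who it is for, what it allows, when it expires."""
    agent_id: str
    scopes: frozenset[str]
    expires_at: float  # Unix timestamp in seconds


def issue_token(agent_id: str, scopes: Iterable[str],
                ttl_seconds: int = 900,
                now: Optional[float] = None) -> ScopedToken:
    """Issue a token limited to the given scopes and lifetime."""
    now = time.time() if now is None else now
    return ScopedToken(agent_id, frozenset(scopes), now + ttl_seconds)


def authorize(token: ScopedToken, required_scope: str,
              now: Optional[float] = None) -> bool:
    """Allow a call only if the token is unexpired and carries the scope."""
    now = time.time() if now is None else now
    return now < token.expires_at and required_scope in token.scopes
```

A token issued with only `campaign.write` fails the check for `campaign.send`, and any token fails once its lifetime has elapsed, which is what limits the value of a stolen credential.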

Does human-in-the-loop prevent all risks?

No. Human approvals are a central control mechanism but frequently degrade in practice. OWASP lists this as a separate risk, ASI09 (Human-Agent Trust Exploitation): automation bias and the authoritative-sounding output of agents lead to rubber-stamping. Effective gates enforce an independent review of the evidence, not merely the agent's recommendation.
