OWASP LLM Top 10 (2025) explained: The ten security risks for LLM applications
The OWASP LLM Top 10 (2025) are the ten most serious security risks for applications built on large language models, published by the OWASP GenAI Security Project. They range from Prompt Injection through Sensitive Information Disclosure to Unbounded Consumption and form the reference for securing LLM and AI-agent systems.
Key Takeaways
- The OWASP LLM Top 10 (2025) were published by the OWASP GenAI Security Project and supersede the 2023/24 list; they apply to every GenAI application from chatbots through RAG to AI agents.
- LLM01 Prompt Injection is the single most consequential risk; the first real-world zero-click case was EchoLeak (CVE-2025-32711, CVSS 9.3) in Microsoft 365 Copilot, disclosed by Aim Labs in June 2025.
- Three entries are new or substantially expanded in 2025: LLM07 System Prompt Leakage, LLM08 Vector and Embedding Weaknesses, and LLM10 Unbounded Consumption (denial-of-wallet).
- LLM06 Excessive Agency is the conceptual bridge to agent security: it sits directly upstream of several agent risks in the OWASP Agentic Top 10 (2026).
- Anyone running AI agents covers only the baseline with the LLM Top 10. Autonomy, tool use and persistent memory expand the attack surface and additionally require the OWASP Agentic Top 10 (ASI01-ASI10).
- Guardrails are no panacea: EchoLeak bypassed Microsoft's XPIA classifier. Effective protection requires layered defence comprising input filtering, scope enforcement, output filtering and behavioural monitoring.
The OWASP LLM Top 10 (2025) are the ten most serious security risks for applications built on large language models, published by the OWASP GenAI Security Project. They range from Prompt Injection through Sensitive Information Disclosure to Unbounded Consumption and form the reference against which DACH security teams should measure every LLM and AI-agent architecture.
- What is it? The most widely used consensus list of the most serious LLM security risks, applicable to chatbots, RAG systems, copilots and agents.
- Who is behind it? The OWASP GenAI Security Project (formerly OWASP Top 10 for LLM Applications, project home genai.owasp.org). The 2025 edition supersedes the 2023/24 list.
- Why relevant now? Anyone deploying AI agents covers only the baseline with it. For autonomous, tool-using systems, the OWASP Agentic Top 10 (ASI01-ASI10) has additionally been available since 9 December 2025.
Positioning: which OWASP list is which?
OWASP now publishes a whole family of overlapping AI security artefacts that procurement, audit and risk teams regularly confuse in practice. For DACH B2B decision-makers, the distinction is crucial:
- OWASP Top 10 for Large Language Model Applications 2025 (LLM01:2025 - LLM10:2025): The list addressed here. It tackles risks at the model and application level in every GenAI/LLM application. It is the reference that most DACH security teams already know.
- OWASP Top 10 for Agentic Applications 2026 (ASI01 - ASI10): Published by the same project on 9 December 2025, developed under the Agentic Security Initiative. Addresses risks arising from autonomy, tool integration and multi-agent coordination.
- OWASP AI Exchange, AI Security and Privacy Guide, MCP Top 10, AIVSS: Complementary building blocks (knowledge base, protocol layer, scoring system) that flank the Top 10 lists but do not replace them.
Important: the LLM Top 10 complement classic application security standards such as OWASP ASVS and the API Security Top 10; they do not replace them. An LLM application still needs authentication, session management, input validation and crypto hygiene. Treating AI security as a separate silo is one of the most common conceptual mistakes.
The ten risks at a glance
The following table summarises all ten entries with a brief description and the central countermeasure.
| Risk (2025) | Brief description | Countermeasure |
|---|---|---|
| LLM01 Prompt Injection | Direct and indirect injection of instructions into the LLM input that override the intended behaviour. The single most consequential risk. | Treat external content as untrusted, separate the instruction and data channels, input/output filtering, provenance-based access control. |
| LLM02 Sensitive Information Disclosure | Disclosure of PII, intellectual property, trade secrets, system prompts and embeddings in LLM outputs; includes PII inversion and embedding extraction. | Data sanitisation, output filtering, DLP integration. |
| LLM03 Supply Chain | Compromise of base models, fine-tuned weights, training data, model registries and dependencies. Static, i.e. at build time. | Pin dependencies, provenance verification, scanning before deployment, SBOM/AIBOM. |
| LLM04 Data and Model Poisoning | Manipulated training, fine-tuning or RAG data that biases the model or embeds backdoors; including backdoored open-source models. | Verify data quality and provenance, source trust assessment, validation before embedding. |
| LLM05 Improper Output Handling | Downstream systems treat LLM output as trusted input; XSS, SQLi, SSRF, command injection via the model output. | Strictly validate and encode output, never execute output unchecked as code/query. |
| LLM06 Excessive Agency | LLMs are granted uncontrolled power to act with insufficient permission limits or human oversight. Expanded in 2025 for agent architectures. | Least privilege per tool, explicit permission scopes, human-in-the-loop for critical actions. |
| LLM07 System Prompt Leakage | System prompt contents that were assumed private are disclosed through probing. | Do not store secrets in the system prompt; do not base security on prompt confidentiality. |
| LLM08 Vector and Embedding Weaknesses | RAG-specific: injection of malicious data into vector stores, embedding poisoning, cross-tenant embedding leaks. New in 2025. | Vector store access controls, tenant isolation, content validation before embedding. |
| LLM09 Misinformation | Confidently phrased false content from hallucinations or biased training data. | Grounding via trusted sources, fact-checking, flagging uncertainty. |
| LLM10 Unbounded Consumption | Resource and cost exhaustion (denial-of-wallet). Expanded in 2025 from the 2023 DoS entry. | Hard cost caps with circuit breakers, rate limits per user/tenant/agent, token anomaly detection. |
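To make the LLM05 countermeasure in the table concrete, here is a minimal Python sketch of "validate and encode output, never execute it unchecked". It is an illustration under assumptions, not a reference implementation: the `customers` table and the function names `render_answer` and `find_customer` are hypothetical, and only the standard-library modules `html` and `sqlite3` are used.

```python
import html
import sqlite3


def render_answer(llm_output: str) -> str:
    """Encode model output before embedding it in an HTML page."""
    return f"<p>{html.escape(llm_output)}</p>"


def find_customer(conn: sqlite3.Connection, llm_extracted_name: str) -> list[tuple]:
    """Pass model-derived values as bound parameters, never via string formatting."""
    cur = conn.execute(
        "SELECT id, name FROM customers WHERE name = ?", (llm_extracted_name,)
    )
    return cur.fetchall()


# Hypothetical demo data: an in-memory database with a single customer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (name) VALUES ('Alice')")

# A classic SQL injection payload that the model might emit after an indirect injection.
malicious = "Alice' OR '1'='1"
print(find_customer(conn, malicious))               # payload is treated as a literal name
print(render_answer("<script>alert(1)</script>"))   # markup is escaped, not executed
```

The design point is that the model output never reaches an interpreter (SQL engine, browser) as code; it only ever travels as data.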
Three entries are new or substantially changed in 2025
Compared with the predecessor list, OWASP has noticeably adapted the 2025 edition to current architectures. Three items deserve particular attention:
- LLM07 System Prompt Leakage has been included as a separate entry because real-world exploits showed that supposedly private system prompts can be extracted via targeted probing. The lesson: the system prompt is not a vault. Secrets, credentials or security logic do not belong in it.
- LLM08 Vector and Embedding Weaknesses is new and reflects the RAG boom. Malicious data in vector stores, embedding poisoning and cross-tenant embedding leaks are real risks as soon as multiple customers share the same vector index.
- LLM10 Unbounded Consumption expands the old denial-of-service item with the cost dimension (denial-of-wallet). With agents the amplification is particularly critical, because multi-step plans multiply token consumption.
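The denial-of-wallet point can be illustrated with a small sketch of a hard cap that acts as a circuit breaker. Everything here is an assumption for illustration: the class names `TokenBudget` and `BudgetExceeded`, the per-tenant token unit and the cap of 10,000 tokens are hypothetical and not taken from OWASP.

```python
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when a tenant would exceed its hard token budget."""


@dataclass
class TokenBudget:
    max_tokens: int  # hard cap per tenant and billing window (assumed unit)
    used: dict[str, int] = field(default_factory=dict)

    def charge(self, tenant: str, tokens: int) -> int:
        """Record usage and return the remaining budget; refuse before the cap is exceeded."""
        spent = self.used.get(tenant, 0)
        if spent + tokens > self.max_tokens:
            raise BudgetExceeded(f"{tenant}: {spent + tokens} > {self.max_tokens}")
        self.used[tenant] = spent + tokens
        return self.max_tokens - self.used[tenant]


budget = TokenBudget(max_tokens=10_000)

# A multi-step agent plan: each step is charged before the model call is made.
for step_tokens in (3_000, 3_000, 3_000):
    budget.charge("tenant-a", step_tokens)
```

Charging before the call, rather than reconciling afterwards, is what stops a runaway multi-step plan instead of merely reporting it.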
The link to AI agents: LLM06 as the bridge
For agencies and B2B decision-makers, the most important conceptual insight is this: the LLM Top 10 were written for systems that at their core respond. A prompt comes in, a completion goes out, possibly grounded by RAG. Agentic systems, by contrast, plan, reason, select tools, write to memory and act with minimal human approval per step.
OWASP puts it this way (in essence, Sotiropoulos et al., 9 December 2025): agentic systems inherit all LLM risks and introduce entirely new vulnerability classes arising from autonomy, tool integration, multi-agent coordination and persistent state. The bridge between the two worlds is LLM06 Excessive Agency, which was explicitly expanded in 2025 for agentic architectures and sits directly upstream of several agent risks (ASI02, ASI03, ASI10).
The open-source red-teaming framework DeepTeam, which maps both lists, describes the amplification effect succinctly: ASI01 (Agent Goal Hijack) corresponds to LLM01 (Prompt Injection) times LLM06 (Excessive Agency), but with multi-step execution that amplifies the damage beyond a single response. Three agentic risks (ASI07 Insecure Inter-Agent Communication, ASI08 Cascading Failures, ASI10 Rogue Agents) have no LLM Top 10 counterpart at all. So anyone reading only the LLM Top 10 systematically underestimates agentic risk.
A concrete example: EchoLeak (CVE-2025-32711)
EchoLeak, the first documented real-world zero-click case in a production LLM system, shows how far prompt injection reaches. In June 2025, Aim Labs disclosed the vulnerability in Microsoft 365 Copilot (CVE-2025-32711, CVSS 9.3, documented in arXiv 2509.10540, Reddy et al., September 2025).
The sequence in pseudo-steps:
```
- The attacker sends a single crafted email to the victim.
- The hidden instructions bypass Microsoft's XPIA classifier
  (Cross-Prompt Injection Attempt).
- Link redaction is defeated via reference-style Markdown.
- Auto-loaded images + a Microsoft Teams proxy allowed by CSP
  exfiltrate the most sensitive content from the Copilot context.
- Not a single user click required (zero-click).
```
Aim Labs coined the term LLM Scope Violation for this. Microsoft patched the issue server-side; customers did not need to take any action. The central lesson for DACH risk committees: even an established vendor classifier (XPIA) was bypassed. The sister case CamoLeak (GitHub Copilot Chat, CVSS 9.6) was disclosed in October 2025 by Omer Mayraz/Legit Security following a HackerOne report in June 2025. There, the CSP protection was defeated via GitHub's own Camo image proxy in order to siphon private repository secrets character by character; GitHub subsequently disabled image rendering in Copilot Chat entirely on 14 August 2025.
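Both cases relied on Markdown that caused the client to fetch an attacker-controlled URL. As a rough illustration of an output-side filter, the following Python sketch strips inline and reference-style link and image targets whose host is not on an allowlist. It is not Microsoft's or GitHub's actual fix; the allowlist host, the regular expressions and the function name `sanitize_markdown` are assumptions, and the regexes cover only the common Markdown forms rather than the full CommonMark grammar.

```python
import re
from urllib.parse import urlparse

ALLOWED_HOSTS = {"intranet.example.com"}  # hypothetical allowlist

# Inline form: [text](url) and ![alt](url)
INLINE = re.compile(r"!?\[([^\]]*)\]\(\s*<?([^)\s>]+)>?[^)]*\)")
# Reference definition: [label]: url
REF_DEF = re.compile(r"^[ ]{0,3}\[([^\]]+)\]:\s*<?(\S+?)>?(?:\s.*)?$", re.MULTILINE)


def _allowed(url: str) -> bool:
    """Allow only absolute URLs whose host is explicitly listed."""
    try:
        return urlparse(url).hostname in ALLOWED_HOSTS
    except ValueError:  # malformed URL: treat as not allowed
        return False


def sanitize_markdown(text: str) -> str:
    """Drop inline and reference-style link/image targets outside the allowlist."""
    # Inline form: keep only the visible text when the target is not allowed.
    text = INLINE.sub(lambda m: m.group(0) if _allowed(m.group(2)) else m.group(1), text)
    # Reference form: removing the definition leaves `[text][label]` unresolved,
    # so a renderer shows it as plain text and fetches nothing.
    text = REF_DEF.sub(lambda m: m.group(0) if _allowed(m.group(2)) else "", text)
    return text
```

A regex filter like this is only one layer; in production, parsing the Markdown with a real parser or sanitising the rendered HTML is the more robust choice.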
Guardrails are therefore no panacea. Every guardrail published to date has been bypassed by competent researchers within months of release. Effective protection requires layered defence: input-side filtering plus scope/provenance enforcement plus output-side filtering plus behavioural monitoring. Vendor claims such as "our guardrail blocks 99.x per cent of prompt injections" should be treated as marketing until an independent red team has verified them.
Detection and defence in practice
For LLM applications, defence-in-depth means in concrete terms:
- Input side: Treat external content (documents, emails, RAG corpus, web pages, tool outputs) as untrusted by default. Separate the instruction and data channels. Deploy input filters such as Llama Guard, Microsoft Prompt Shield, NVIDIA NeMo Guardrails or Lakera Guard (as of 2026).
- Runtime: Provenance-based access control, so that content marked as external cannot trigger privileged data access. Restrict Markdown rendering, prevent auto-fetching of images.
- Output side: Check output against expected patterns and never execute it unchecked as code, SQL or HTML in downstream systems (LLM05).
- Operations: Continuous red-teaming with Garak, PyRIT or DeepTeam. Hard cost caps and rate limits against Unbounded Consumption. Complete forensic logging.
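The runtime layer above (provenance-based access control) can be sketched as a simple taint check. This is a minimal illustration under assumptions: the types `Provenance` and `ContextItem`, the function `may_call` and the tool names are hypothetical, and a real system would enforce the decision in the tool-dispatch layer rather than in application code next to the prompt.

```python
from dataclasses import dataclass
from enum import Enum


class Provenance(Enum):
    USER = "user"          # typed by the authenticated user
    EXTERNAL = "external"  # email, web page, RAG chunk, tool output


@dataclass(frozen=True)
class ContextItem:
    text: str
    provenance: Provenance


PRIVILEGED_TOOLS = {"read_mailbox", "search_sharepoint"}  # hypothetical tool names


def may_call(tool: str, context: list[ContextItem]) -> bool:
    """Deny privileged tools as soon as untrusted content is in the context window."""
    tainted = any(item.provenance is Provenance.EXTERNAL for item in context)
    return not (tool in PRIVILEGED_TOOLS and tainted)
```

For example, `may_call("read_mailbox", ...)` is permitted while the context holds only user input, and denied once an inbound email has been added to the same context.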
An honest framing is part of this: guardrails introduce latency (typically 100 to 500 milliseconds per rule in production) and cost (each rule is another inference call). In multilingual DACH contexts (DE/FR/IT/EN mixed) the false-positive rates remain high.
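A quick back-of-the-envelope calculation shows how this latency adds up. The sketch below assumes the 100 to 500 millisecond range quoted above and, as a further assumption, that rules run either strictly one after another or fully in parallel; the function name `added_latency_ms` is hypothetical.

```python
def added_latency_ms(
    rules: int,
    per_rule_ms: tuple[int, int] = (100, 500),
    parallel: bool = False,
) -> tuple[int, int]:
    """Return the (best case, worst case) latency added by `rules` guardrail checks."""
    if rules <= 0:
        return (0, 0)
    low, high = per_rule_ms
    if parallel:
        # All checks run concurrently: the overhead is bounded by a single rule.
        return (low, high)
    # Sequential checks: every rule adds its own inference call.
    return (rules * low, rules * high)


print(added_latency_ms(4))                 # four sequential rules
print(added_latency_ms(4, parallel=True))  # the same rules run concurrently
```

Under these assumptions, four sequential rules add between 0.4 and 2 seconds per request, which is why rule count and execution order belong in the architecture decision and not only in the security review.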
For agencies and B2B
Marketing agencies and the DACH SME sector increasingly deploy LLM-powered copilots, RAG assistants and first agents in production, often as a bought-in SaaS feature in Microsoft 365 Copilot, ChatGPT Enterprise, Claude for Work or Gemini for Workspace (as of 2026). It is precisely here that LLM01 (indirect injection via content), LLM02 (data disclosure in summaries) and LLM06 (overly broad tool permissions) constitute the realistic attack surface.
Three pointers for procurement and governance: First, "OWASP-compliant" is not a meaningful claim, since OWASP publishes guidance and not certifications. Second, every relevant vendor guardrail should be set to its strictest level and vendor logs exported into a SIEM. Third, anyone planning agents must read the OWASP Agentic Top 10 alongside the LLM Top 10, otherwise the risk model remains incomplete. Blck Alpaca, based in Vienna, supports the positioning of these risks within EU AI Act, GDPR and ISO 42001 contexts and the selection of appropriate protection layers. The OWASP lists are updated at least annually, so genai.owasp.org should be bookmarked as part of every risk process.
FAQ
What are the OWASP LLM Top 10 (2025)?
How do the OWASP LLM Top 10 differ from the OWASP Agentic Top 10?
What is the difference between LLM01 Prompt Injection and a jailbreak?
Is an LLM application fully secured with the OWASP LLM Top 10?
What does LLM10 Unbounded Consumption mean?