GDPR Legal Basis for AI Agents (Art. 6): When Consent, Contract or Legitimate Interest Applies

Blck Alpaca
Definition

The GDPR legal basis for AI determines which ground under Art. 6(1) GDPR a company relies on for data processing by an AI agent. For internal deployments, fine-tuning, RAG and B2B service agents, legitimate interest (Art. 6(1)(f)) is the dominant basis, alongside consent, contract or legal obligation.

Key Takeaways

  • Every processing operation on personal data by an AI agent - prompts, outputs, agent memory, RAG indexes, tool-call payloads, logs - requires its own legal basis under Art. 6(1) GDPR.
  • Legitimate interest (Art. 6(1)(f)) is in principle available according to EDPB Opinion 28/2024, but requires the three-step test: purpose, necessity, balancing.
  • Contract (lit. b) only covers what is strictly necessary to perform the contract - fine-tuning on customer data is rarely contractually necessary.
  • Publicly accessible data is no free pass: Clearview AI was fined 20 million euros each in Italy and France, among other jurisdictions; a separate balancing exercise remains mandatory.
  • Sensitive data (Art. 9) is generally prohibited - in addition to Art. 6, a separate exemption under Art. 9(2) is required.
  • For automated individual decisions (Art. 22), the requirements tighten considerably - covered by its own topic cluster.

Without a viable legal basis, any processing is unlawful. In short:

  • Legitimate interest (lit. f) is the most common basis for internal AI agents, fine-tuning, RAG, fraud detection and ordinary B2B service agents - but it requires the three-step test.
  • Contract (lit. b) only covers what is strictly necessary to perform the contract; consent (lit. a) suits opt-in features but hardly scales to training datasets.
  • Sensitive data (Art. 9) and automated individual decisions (Art. 22) trigger additional requirements that go beyond Art. 6.

Why every AI agent needs a legal basis

Under Art. 4(1)/(2) GDPR, any operation on personal data triggers the Regulation. In an agentic system, almost every touchpoint is a processing operation: prompts and system prompts, outputs (including hallucinated personal data), agent memory and vector stores, tool-call payloads to internal APIs or MCP servers, multi-agent messages, as well as logs and traces. For each of these data flows, the controller needs a legal basis under Art. 6(1).
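One way to keep track of this is a per-flow register. The following is a minimal sketch, not a prescribed format: the `DataFlow` class, the `flows_without_basis` helper and the flow names are assumptions made for illustration, and the bases assigned in the example are placeholders rather than legal conclusions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LegalBasis(Enum):
    """The six grounds of Art. 6(1) GDPR, keyed by their letter."""
    CONSENT = "a"
    CONTRACT = "b"
    LEGAL_OBLIGATION = "c"
    VITAL_INTERESTS = "d"
    PUBLIC_TASK = "e"
    LEGITIMATE_INTEREST = "f"


@dataclass(frozen=True)
class DataFlow:
    """One processing touchpoint of the agent (hypothetical structure)."""
    name: str
    basis: Optional[LegalBasis] = None


def flows_without_basis(flows: list[DataFlow]) -> list[str]:
    """Return the names of all flows that have no documented legal basis."""
    return [flow.name for flow in flows if flow.basis is None]


# Example register; the assignments are placeholders, not legal advice.
register = [
    DataFlow("prompts", LegalBasis.LEGITIMATE_INTEREST),
    DataFlow("agent_memory", LegalBasis.LEGITIMATE_INTEREST),
    DataFlow("tool_call_payloads", LegalBasis.CONTRACT),
    DataFlow("logs_and_traces"),  # basis not yet documented
]

print(flows_without_basis(register))
```

A check like this could run before go-live, so that a data flow without a documented basis blocks the release instead of surfacing later.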

Important: prompts and outputs are almost always personal as soon as they name or sufficiently characterise a person - including fabricated content. Both the Hamburg ChatGPT complaint and the Italian Garante's proceedings against OpenAI confirm that fabricated personal data remains personal data. According to CNIL and Hamburg discussion papers, embeddings are by default regarded as pseudonymous personal data, since text-inversion attacks enable re-identification.

The six legal bases and their fit for AI agents

Art. 6(1) GDPR recognises six grounds. For AI agents, three are dominant in practice - consent (lit. a), contract (lit. b) and legitimate interest (lit. f) - supplemented by legal obligation (lit. c).

| Legal basis (Art. 6(1)) | When appropriate | Requirements / caveats |
| --- | --- | --- |
| (a) Consent | End-customer assistants, voice cloning, non-essential analytics, biometric agents | Freely given, specific, informed, unambiguous (Art. 7); revocable; hardly feasible for training datasets; presumed not freely given in the employment context (BAG 14.12.2023 - 6 AZR 199/22) |
| (b) Performance of a contract | Service agent for existing customers, employment-related agents, agent-assisted order processing | Narrow: only what is necessary to perform the contract to which the data subject is party; fine-tuning on customer data rarely contractually necessary |
| (c) Legal obligation | KYC/AML agents, sanctions screening, tax classification, audit trails (Section 257 HGB, Section 147 AO) | Requires a sufficiently specific EU or member-state law; generic "compliance" is not enough |
| (d) Vital interests | Almost never a dominant basis for agents | Only for life-and-death cases |
| (e) Public task | Public-sector agents (Section 3 BDSG, Austrian DSG, Swiss DSG) | Requires a statutory basis for the task |
| (f) Legitimate interest | Dominant basis for internal deployments, training, fine-tuning, RAG, behavioural analytics, fraud detection, cybersecurity, ordinary B2B service agents | Three-step test; particularly restricted under Art. 22; not available to public authorities in the performance of their tasks |

The three-step test for legitimate interest

According to the EDPB Guidelines 1/2024 (adopted on 8 October 2024) and EDPB Opinion 28/2024 (adopted on 17 December 2024), controllers must document three steps:

  1. Purpose test. The interest must be lawful, real, present and specifically articulated. "Improving AI products" in the abstract is not enough; "improving the classification accuracy of the internal HR screening agent for our employees" is.
  2. Necessity test. Could the purpose be achieved with less data or anonymous data? For RAG, this often argues in favour of pseudonymising indexed documents and minimising context payloads.
  3. Balancing test. The controller's interest is weighed against the reasonable expectations of data subjects, the nature of the data, the relationship between the parties, and risks such as re-identification, scope creep and profiling. The criteria from EDPB Opinion 28/2024 apply, including whether the data was publicly available, the nature of the service, and whether data subjects are aware that their data is online.
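The three steps can be captured as a structured record so that gaps in the documentation become visible. This is a sketch under assumptions: the `LegitimateInterestAssessment` class and its field names are hypothetical and do not follow any official template.

```python
from dataclasses import dataclass, field


@dataclass
class LegitimateInterestAssessment:
    """Hypothetical record of the three documented steps of an LIA."""
    purpose: str              # step 1: specifically articulated interest
    necessity_rationale: str  # step 2: why less or anonymous data will not do
    # step 3: balancing factor -> documented finding
    balancing_factors: dict[str, str] = field(default_factory=dict)

    def missing_steps(self) -> list[str]:
        """List the steps that are still undocumented."""
        missing = []
        if not self.purpose.strip():
            missing.append("purpose")
        if not self.necessity_rationale.strip():
            missing.append("necessity")
        if not self.balancing_factors:
            missing.append("balancing")
        return missing


lia = LegitimateInterestAssessment(
    purpose="Improving the classification accuracy of the internal HR screening agent",
    necessity_rationale="",  # not yet written
)
print(lia.missing_steps())
```

The check only detects empty steps. Whether a documented purpose is specific enough, or a balancing exercise convincing, remains a legal assessment.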

EDPB Opinion 28/2024 confirms that legitimate interest is in principle available for both development and deployment. As examples of potentially legitimate interests, it cites conversational assistants, fraud detection and cybersecurity. On 19 June 2025, the CNIL added a taxonomy of presumptively legitimate AI purposes: scientific research, providing public access, conversational assistants, improving the performance of an existing product, and fraud prevention.

Application to typical agent scenarios

  • Fine-tuning on internal customer data: typically Art. 6(1)(f). The purpose must be narrower than "improving our AI", and an LIA (legitimate interest assessment) must be carried out and supplemented by a compatibility analysis under Art. 5(1)(b). Where sensitive data is involved, an Art. 9 exemption is additionally required.
  • RAG over employee documents: usually lit. b or lit. f. If the agent measures employee behaviour, co-determination is required (Section 87(1) no. 6 BetrVG, Austrian ArbVG, Swiss ArG).
  • Cross-session agent memory: the legal basis must cover secondary storage; legitimate interest with a clear retention policy and opt-out is the usual pattern, documented under Art. 5(1)(b).
  • Training foundation models on broadly scraped corpora: EDPB Opinion 28/2024 sets a high bar here. Without a documented LIA, a web-scraping exclusion policy and a robust "disproportionate effort" justification under Art. 14(5)(b), controllers should not run large-scale training on scraped data.

The "publicly available" trap

A persistent misconception is that publicly accessible web data is exempt from the GDPR. Clearview AI was sanctioned for precisely this reason: it was fined 20 million euros in Italy (10 February 2022) and 20 million euros in France (17 October 2022, plus a periodic penalty payment of 5.2 million euros), and was also sanctioned in Greece, the United Kingdom and the Netherlands. Public availability amounts neither to consent nor to any other legal basis. The Meta v. Bundeskartellamt judgment (C-252/21 of 4 July 2023, paras. 85-89) confirms: the "manifestly made public" exemption under Art. 9(2)(e) is a very narrow exception, and Art. 6(1)(f) still requires a separate balancing exercise.

Sensitive data: Art. 9 as a second gateway

Art. 9(1) generally prohibits the processing of sensitive data - including health data, biometric data for the purpose of unique identification, data concerning sexual orientation, and trade-union membership. A legal basis under Art. 6 alone is not sufficient; an exemption under Art. 9(2) is additionally required. The practically relevant ones are: explicit consent (lit. a), employment and social security law (lit. b, e.g. Section 26(3) BDSG), substantial public interest (lit. g), and healthcare (lit. h/i).
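The two-gateway logic can be expressed as a simple check. This is a sketch under assumptions: the function name and parameters are hypothetical, and the set of exemptions is limited to the letters the paragraph above names as practically relevant, although Art. 9(2) contains further letters.

```python
from typing import Optional

# Assumption: only the Art. 9(2) letters named above as practically relevant.
ART9_EXEMPTIONS = frozenset({"a", "b", "g", "h", "i"})


def gateways_documented(art6_basis: Optional[str],
                        has_special_category_data: bool,
                        art9_exemption: Optional[str] = None) -> bool:
    """Return True only if every gateway that is required has been documented."""
    if art6_basis is None:
        return False  # first gateway (Art. 6) is always required
    if not has_special_category_data:
        return True   # second gateway not triggered
    return art9_exemption in ART9_EXEMPTIONS


print(gateways_documented("f", has_special_category_data=True))
print(gateways_documented("f", has_special_category_data=True, art9_exemption="a"))
```

The first call reports a gap because health or biometric data is involved without an Art. 9(2) exemption; the second passes the formal check. The function only verifies that something is documented, not that the exemption actually applies.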

The AI Act special rule in Art. 10(5) permits the processing of sensitive data for bias detection in high-risk systems - but it does not constitute a stand-alone Art. 9 exemption. The controller still needs a gateway under Art. 9(2), typically lit. g, which in most member states still requires national legislation. This legislative gap persists as of 2026 and may change; until it is closed, providers should base Art. 10(5) processing on a documented lit. g justification and pseudonymise as early as possible.

Automated individual decisions (Art. 22) - in brief

If the AI agent makes a solely automated decision with legal or similarly significant effect, Art. 22 tightens the requirements considerably. Following the CJEU judgments in SCHUFA (C-634/21, 7 December 2023) and Dun & Bradstreet Austria (C-203/22, 27 February 2025), a probability score can itself count as a "decision" where the recipient draws strongly on it, and data subjects are entitled to an intelligible, contestable explanation of the logic involved. Details are covered by the dedicated cluster article on Art. 22.
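A common engineering pattern is to make sure a scoring agent never issues an adverse outcome on its own. The sketch below is purely illustrative: `route_enquiry`, the routing labels and the threshold of 0.2 are assumptions, and whether a given review step is sufficient under Art. 22 is a legal question the code cannot answer.

```python
def route_enquiry(score: float, reject_below: float = 0.2) -> str:
    """Route a scored enquiry; low scores go to a human instead of being rejected.

    The threshold is an arbitrary example value, not a recommendation.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < reject_below:
        return "human_review"   # no automatic rejection
    return "auto_prioritise"


print(route_enquiry(0.05))
print(route_enquiry(0.90))
```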

Concrete example with figures

A DACH mid-sized company (around 800 employees) deploys an internal lead-scoring agent that classifies CRM records. The processing is based on Art. 6(1)(f). The LIA documents the purpose narrowly ("prioritising incoming B2B enquiries"), examines necessity (pseudonymisation of the indexed fields) and the balancing exercise. As part of data minimisation, the context payload per tool call is reduced from an original 42 CRM fields to 7 necessary fields, and prompt retention is limited to 30 days with redaction before permanent storage - aligned with OpenAI's API default retention of at most 30 days (Anthropic API logs: 7 days since 14 September 2025, previously 30 days). As soon as the score automatically rejects an enquiry, the scenario moves into the scope of Art. 22 - at which point a final human decision and explainability become mandatory.
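The two minimisation measures from this example, an allow-list for the context payload and a 30-day retention limit, could look roughly as follows. This is a sketch under assumptions: the seven field names are invented for illustration, and `minimise_payload` and `is_expired` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical allow-list: the seven field names are invented for illustration.
ALLOWED_FIELDS = frozenset({
    "company_name", "industry", "enquiry_text", "enquiry_date",
    "product_interest", "country", "company_size",
})
RETENTION = timedelta(days=30)


def minimise_payload(crm_record: dict) -> dict:
    """Keep only allow-listed fields before the record reaches a tool call."""
    return {key: value for key, value in crm_record.items() if key in ALLOWED_FIELDS}


def is_expired(stored_at: datetime, now: datetime) -> bool:
    """True if a stored prompt is older than the retention period."""
    return now - stored_at > RETENTION


# Build a dummy record: 35 fields the agent does not need plus the 7 it does.
record = {f"field_{i}": i for i in range(35)}
record.update({name: "x" for name in ALLOWED_FIELDS})

payload = minimise_payload(record)
print(len(record), "->", len(payload))

stored = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(is_expired(stored, datetime(2025, 2, 1, tzinfo=timezone.utc)))
```

An allow-list is used rather than a block-list so that fields added to the CRM later are excluded by default.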

For agencies and B2B decision-makers

Anyone deploying AI agents in production should determine the legal basis before go-live and document it per data flow - not justify it after the fact. The Italian Garante fined OpenAI 15 million euros (decision of 2 November 2024, published on 20 December 2024), among other things, because data was processed without a previously clarified legal basis (breach of Art. 5(1)(a), Art. 5(2) and Art. 6). For agencies, this means: a clean LIA, a layered privacy notice with an AI disclosure, and a prominent opt-out are part of the standard deliverables of every AI project. Blck Alpaca supports DACH companies in choosing the appropriate legal basis, documenting balancing exercises in an audit-proof manner, and setting up AI agents in a GDPR-compliant way from the outset.

Note: This article is intended for professional guidance and does not constitute legal advice. For specific individual cases, we recommend consulting data protection officers or a specialised law firm.

FAQ

Which legal basis applies to an internal AI agent that accesses employee or customer data?
As a rule, legitimate interest under Art. 6(1)(f) GDPR. There must be a specifically articulated purpose (not "improving AI"), a necessity test and a documented legitimate interest assessment (LIA). For RAG over employee documents, lit. b (contract) may also be relevant; if the agent measures employee behaviour, the co-determination of the works council or staff representation is required (Section 87(1) no. 6 BetrVG, ArbVG).
Can I use consent as the legal basis for training an AI model?
Rarely in practice. Consent under Art. 6(1)(a) must be freely given, specific, informed, unambiguous (Art. 7) and revocable - hardly feasible for training datasets. In the employment context, according to the Federal Labour Court (BAG, 14.12.2023 - 6 AZR 199/22), it is presumed not to be freely given unless voluntariness can be demonstrated. For opt-in features such as voice cloning of a well-known speaker, it may be appropriate.
Is publicly accessible web data sufficient as a legal basis for AI training?
No. Public availability amounts neither to consent nor to any other legal basis. Clearview AI was sanctioned in several jurisdictions, including Italy (20 million euros, 10.02.2022) and France (20 million euros, 17.10.2022, plus a periodic penalty payment of 5.2 million euros). Following Meta v. Bundeskartellamt (C-252/21), a separate balancing exercise remains mandatory under lit. f; the "manifestly made public" exemption under Art. 9(2)(e) is very narrow.
What happens if an AI agent processes sensitive data under Art. 9?
Art. 9(1) generally prohibits the processing of, among others, health, biometric, trade-union or sex-life data. In addition to the legal basis under Art. 6, a separate exemption under Art. 9(2) is required - such as explicit consent (lit. a), employment law (lit. b) or substantial public interest (lit. g). The AI Act special rule in Art. 10(5) does not replace this exemption.
Is a contract with the customer sufficient as a legal basis for any service agent?
Only to a limited extent. Art. 6(1)(b) covers exclusively what is necessary to perform the contract to which the data subject is party. A service agent answering a specific customer enquiry is usually covered. Fine-tuning on customer data or cross-session memory, by contrast, are rarely contractually necessary and typically require legitimate interest plus a compatibility assessment (Art. 5(1)(b)).
