12.1 · Intermediate · 8 min

Data Processing Agreements under Art. 28 GDPR with AI Providers: The DPA Guide

Blck Alpaca
Definition

A data processing agreement (DPA) under Art. 28 GDPR is mandatory as soon as an AI provider processes personal data on your behalf under instruction, for example via prompts, embeddings or logs. It sets binding rules for instruction-bound processing, security, sub-processors, audit rights and deletion at the end of the contract.

Key Takeaways

  • A DPA under Art. 28 GDPR is mandatory as soon as an AI provider processes personal data (prompts, inference inputs/outputs, agent memory, tool-call payloads, logs, vector stores) as a processor on behalf of the controller.
  • Art. 28(3)(a)-(h) prescribes eight minimum contents – instruction-bound processing, confidentiality, Art. 32 security, sub-processor provisions, support for data subject rights, cooperation on security/notification/DPIA, as well as deletion and audit rights.
  • The major providers (Microsoft, OpenAI, Anthropic, Google, AWS) offer standard DPAs with a contractual 'no training' commitment for enterprise/API data – these are contractual commitments, not statutory defaults, and they reserve narrow own purposes (abuse monitoring, safety).
  • Modern agents create 5- to 8-tier sub-processor cascades (model, cloud, vector store, memory, MCP server, observability) – each tier requires a complete DPA chain under Art. 28(4).
  • Most common audit findings: undocumented MCP flows, observability providers without a DPA, vague 'AI services' descriptions and log retention conflicts (provider 30 days vs. desired 7 days).
  • Breaches of Art. 28 are subject to fines under Art. 83(4) GDPR of up to EUR 10 million or 2% of global annual turnover, whichever is higher; the EUR 20 million or 4% ceiling applies to the infringements listed in Art. 83(5). The Garante proceedings against OpenAI (EUR 15 million, decision 2 November 2024) demonstrate enforcement practice.

A data processing agreement (DPA; in German "Auftragsverarbeitungsvertrag", AVV) under Art. 28 GDPR is mandatory as soon as an AI provider processes personal data on your behalf under instruction. With AI agents this is practically always the case: prompts, inference inputs and outputs, agent memory, tool-call payloads, logs and vector stores continuously carry personal references. The DPA sets binding rules for what the provider may do with this data and what it may not.

The three most important answers up front:

  • The DPA obligation arises with instruction-bound processing. If the provider does not itself determine the purpose and essential means, it is a processor under Art. 4(8) GDPR – and a DPA is legally required.
  • Eight minimum contents are non-negotiable. Art. 28(3)(a)-(h) defines the mandatory canon; for AI, provider-specific clauses (no training, sub-processor map, data residency, log retention) are added.
  • The chain must be complete. Every sub-processor in the typical 5- to 8-tier agent cascade requires its own, equivalent binding under Art. 28(4).

Controller or processor? Classifying the AI provider

Under Art. 4(7)/(8) GDPR and EDPB Guidelines 7/2020, the controller determines the purposes and essential means of the processing; the processor acts on documented instructions. Classifying an AI provider hinges on three questions:

  • Who determines the purposes? If the provider uses prompts to improve its own models, it slips into the role of controller for this purpose.
  • Who determines the essential means? The processor may choose technical means; the essential means remain with the controller.
  • Practical anchor points: use of training data, selection of sub-processors, retention defaults, access for abuse monitoring.

Most enterprise contracts (Microsoft Azure OpenAI, OpenAI Enterprise/API, Anthropic Claude for Work/Enterprise, Google Vertex AI, AWS Bedrock, Mistral, Aleph Alpha) are set up as a controller-to-processor arrangement with an explicit commitment not to train on customer data. Important: these are contractual commitments, not statutory minimum standards. Providers typically reserve narrow rights (abuse monitoring, safety review, aggregated analyses), and for these limited purposes they may act as controllers themselves.
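
The per-purpose logic of this classification can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a legal test: the class `ProcessingPurpose`, the function `classify_role` and the three example purposes are hypothetical names chosen here, and a real assessment under EDPB Guidelines 7/2020 weighs more factors than two flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingPurpose:
    """One purpose for which the provider touches personal data (hypothetical model)."""
    name: str
    provider_sets_purpose: bool          # does the provider decide why the data is processed?
    provider_sets_essential_means: bool  # does the provider decide e.g. which data, how long, who has access?


def classify_role(purpose: ProcessingPurpose) -> str:
    """Simplified rule: deciding purpose or essential means points to a controller role."""
    if purpose.provider_sets_purpose or purpose.provider_sets_essential_means:
        return "controller"
    return "processor"


# Illustrative purposes only; the flags are assumptions, not findings about any provider.
purposes = [
    ProcessingPurpose("inference on customer prompts", False, False),
    ProcessingPurpose("model improvement from prompts", True, True),
    ProcessingPurpose("abuse monitoring", True, True),
]

roles = {p.name: classify_role(p) for p in purposes}
for name, role in roles.items():
    print(f"{name}: {role}")
```

The point of classifying per purpose rather than per provider is that the same provider can be a processor for inference and a controller for its reserved own purposes.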

The eight mandatory contents under Art. 28(3) GDPR

An effective DPA must govern the following points:

  • (a) Subject matter, duration, nature, purpose of the processing, data types, categories of data subjects as well as the obligations and rights of the controller.
  • (b) Processing exclusively on documented instructions.
  • (c) Confidentiality obligation of the personnel deployed.
  • (d) Security measures under Art. 32.
  • (e) Sub-processor provisions (authorisation under Art. 28(2), notification of changes).
  • (f) Support in fulfilling data subject rights.
  • (g) Cooperation on security, data breach notification and DPIA (Art. 32-36).
  • (h) Deletion or return of the data at the end of the service as well as inspection and audit rights.

For AI providers, the following clauses must additionally be negotiated:

  • No training/fine-tuning of the provider's models on customer data without explicit consent.
  • No human review of prompts/outputs except via defined safety/abuse paths with logged access.
  • A sub-processor map with named entities, location, function and sub-sub-processors.
  • Data residency commitments (EU Data Boundary, Swiss-DPF).
  • Configurable log retention including zero-data-retention modes.
  • Notification of changes to the training policy with opt-out.
  • Liability for sub-processor breaches pursuant to Art. 28(4).
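
One way to make this canon reviewable is to treat it as a checklist and report what a draft DPA does not yet cover. The following is a minimal sketch; the identifiers (`MANDATORY_POINTS`, `AI_CLAUSES`, `missing_items`) and the short clause keys are hypothetical labels invented for this example.

```python
# Art. 28(3)(a)-(h) GDPR, abbreviated as in the list above.
MANDATORY_POINTS = {
    "a": "subject matter, duration, nature, purpose, data types, data subjects",
    "b": "processing only on documented instructions",
    "c": "confidentiality of personnel",
    "d": "security measures under Art. 32",
    "e": "sub-processor provisions",
    "f": "support for data subject rights",
    "g": "cooperation on security, breach notification and DPIA",
    "h": "deletion or return, inspection and audit rights",
}

# AI-specific clauses from the text; the keys are hypothetical short names.
AI_CLAUSES = [
    "no_training",
    "no_human_review",
    "sub_processor_map",
    "data_residency",
    "log_retention",
    "training_policy_change_notice",
    "sub_processor_liability",
]


def missing_items(covered: set[str]) -> list[str]:
    """Return every required item that the reviewed DPA does not cover, in a stable order."""
    required = [f"art28_3_{letter}" for letter in MANDATORY_POINTS] + AI_CLAUSES
    return [item for item in required if item not in covered]


# Example: a draft that covers the statutory points and the no-training clause only.
draft = {f"art28_3_{letter}" for letter in MANDATORY_POINTS} | {"no_training"}
gaps = missing_items(draft)
print(gaps)
```

In this example the draft passes the statutory minimum but still lacks six of the seven AI-specific clauses.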

Instruction-bound processing, sub-processors and audit rights

Instruction-bound processing (point (b)) is the core: the provider may only process personal data as you instruct in documented form. Any unilateral change of purpose, for example model improvement from your prompts, falls outside the scope of processing on your behalf.

Sub-processors demand the greatest practical effort. A modern agent deployment typically comprises a five- to eight-tier cascade:

  1. Foundation model providers (OpenAI, Anthropic, Mistral, Aleph Alpha, Google, Cohere)
  2. Hosting/cloud providers (Azure, AWS, GCP, IONOS, STACKIT, Swisscom, Open Telekom Cloud)
  3. Orchestration/agent framework runtime
  4. Vector store/RAG infrastructure (Pinecone, Weaviate, Qdrant, Milvus, pgvector)
  5. Memory providers (mem0, Letta, in-house)
  6. MCP servers (each external server is, depending on the data flow, a processor or controller)
  7. Observability (Langfuse, LangSmith, Helicone, Datadog)
  8. Evaluation/red-team services

For each tier, the deployer must verify that a DPA exists, that the chain is unbroken under Art. 28(4) and that the data residency commitments are consistent. The most common audit findings: undocumented MCP server flows, observability providers without a DPA and evaluation services that collect prompt traces.
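
The per-tier verification can be sketched as a simple pass over the cascade. This is an illustrative sketch only: `SubProcessor`, `chain_findings` and all entries in `chain` are hypothetical, and the region check assumes a single agreed residency set such as "EU".

```python
from dataclasses import dataclass


@dataclass
class SubProcessor:
    tier: str      # e.g. "foundation model", "MCP server"
    name: str      # placeholder names below, not real vendors
    has_dpa: bool  # is there a DPA (or equivalent binding) for this link in the chain?
    region: str    # where personal data is processed


def chain_findings(
    chain: list[SubProcessor],
    allowed_regions: frozenset[str] = frozenset({"EU"}),
) -> list[str]:
    """List gaps: missing DPAs and processing outside the agreed regions."""
    findings = []
    for sp in chain:
        if not sp.has_dpa:
            findings.append(f"{sp.tier}: {sp.name} has no DPA")
        if sp.region not in allowed_regions:
            findings.append(f"{sp.tier}: {sp.name} processes data in {sp.region}")
    return findings


# Hypothetical deployment used purely for illustration.
chain = [
    SubProcessor("foundation model", "example-model-api", True, "EU"),
    SubProcessor("vector store", "example-vector-db", True, "EU"),
    SubProcessor("MCP server", "example-mcp-crm", False, "EU"),
    SubProcessor("observability", "example-tracing", False, "US"),
]

findings = chain_findings(chain)
for finding in findings:
    print(finding)
```

Keeping the inventory as data makes it straightforward to rerun the check whenever a provider announces a sub-processor change.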

For audit rights (point (h)), pure "summary-report-only" patterns are insufficient as soon as high-risk data is involved. The BfDI, BayLDA, HmbBfDI and LDI NRW routinely require the DPA plus the specific configuration of telemetry, training opt-out and region settings.

Status of major providers' DPAs (as of May 2026)

The following table is a research summary as of 14 May 2026, not legal advice; provider terms change frequently and must be re-validated when concluding a contract.

| Provider | Standard DPA | "No training on customer data"? | Data residency EU/CH | Retention default |
| --- | --- | --- | --- | --- |
| Microsoft 365 Copilot / Azure OpenAI | Microsoft Products and Services DPA | Yes, contractually: prompts, completions, embeddings, fine-tuning data not used for model training without explicit instruction; Anthropic models in M365 Copilot (from Jan 2026) excluded from the EU Data Boundary | EU Data Boundary commitment (in-region storage/processing); fine-tuning possibly cross-region | Configurable; in-tenant default; abuse monitoring logs up to 30 days unless opted out |
| OpenAI API (Enterprise/Business/API) | OpenAI DPA (effective from 1 Jan 2026; OpenAI Ireland Ltd as EEA/CH contracting party) | Yes: API and ChatGPT Enterprise/Team customer data not used for training by default | EU data residency for eligible enterprise/edu accounts and certain API setups; otherwise global routing | API data max. 30 days; enterprise for the contract duration; zero data retention available contractually |
| Anthropic Claude for Work/Enterprise/API | Anthropic Commercial Terms + DPA | Yes: commercial inputs/outputs not used for training by default (consumer tier differs, not enterprise) | Direct API "us"/"global"; EU residency via AWS Bedrock EU or Google Vertex AI EU | API logs: 7 days (since 14 Sep 2025, previously 30); ZDR via DPA |
| Google Vertex AI (Gemini, Claude on Vertex) | Google Cloud DPA + Vertex AI Terms | Yes: customer data not used for training Google or third-party models | EU regional endpoints (10 EU regions for Claude on Vertex); region guarantee | Configurable; short default |
| AWS Bedrock (Anthropic, Meta, Mistral, Cohere, Amazon) | AWS GDPR DPA + Service Terms | Yes: prompts/outputs not used for provider model training; models isolated from provider services | EU regions; European Sovereign Cloud (eusc-de-east-1, Jan 2026), Claude not yet available there | Configurable; Bedrock does not log prompts/outputs by default |

Sovereign alternatives governed by German/Swiss DPA law (Aleph Alpha Pharia, IONOS AI Model Hub, STACKIT, T-Systems Open Telekom Cloud, Swisscom Sovereign AI) offer exclusive DE/EU or CH processing and customer-controlled retention. DeepSeek is not recommended for regulated DACH sectors due to PRC jurisdiction and a Garante information request (January 2025).

Common DPA mistakes in AI projects

  • Vague processing descriptions ("AI services" instead of specific models, prompts, embeddings, fine-tuning).
  • Out-of-date sub-processor lists – MCP servers and observability providers are missing.
  • Weak audit rights for high-risk data.
  • Training data ambiguity – "We do not train on your data" is blurred with "We may use it for safety/improvement".
  • Log retention conflicts – provider retention exceeds your own policy (e.g. 30 days vs. desired 7 days).
  • Cross-border routing without explicit residency determination.
  • Missing upstream due diligence contrary to EDPB Opinion 28/2024 (review of the lawfulness of the upstream training data).
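
The log retention conflict in particular lends itself to an automated check. The sketch below compares provider retention defaults against an internal policy limit; the function `retention_conflicts` and the provider keys are hypothetical, and the day counts simply reuse the 30-day and 7-day figures mentioned above as example inputs.

```python
def retention_conflicts(provider_days: dict[str, int], policy_days: int) -> dict[str, int]:
    """Map each conflicting provider to the number of days it exceeds the internal policy."""
    return {
        name: days - policy_days
        for name, days in provider_days.items()
        if days > policy_days
    }


# Example inputs; replace with the values from the contracts actually concluded.
provider_defaults = {
    "provider_a_api_logs": 30,
    "provider_b_api_logs": 7,
}

conflicts = retention_conflicts(provider_defaults, policy_days=7)
print(conflicts)  # {'provider_a_api_logs': 23}
```

A non-empty result means either the provider configuration (for example a zero-data-retention mode) or the internal policy has to change before go-live.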

DPA review checklist

Worked example with figures: Mid-sized Copilot rollout

A company with 800 employees introduces Microsoft 365 Copilot and a customer service chatbot on Azure OpenAI. This results in at least two main DPAs (Microsoft, OpenAI/Azure) plus a sub-processor chain. Points to review: the Microsoft Products and Services DPA is in force; the Anthropic-as-sub-processor flow is documented (Anthropic excluded from the EU Data Boundary since early 2026, with opt-in/opt-out by the deployer); the Azure OpenAI sub-processor list (as of April 2025) is integrated into the record of processing activities; all MCP/connector flows are mapped; abuse monitoring logs are limited to a maximum of 30 days, OpenAI API data to 30 days and Anthropic API logs to 7 days. The risk of failure: breaches of Art. 28 are subject to fines under Art. 83(4) GDPR of up to EUR 10 million or 2% of global annual turnover, whichever is higher; the EUR 20 million or 4% ceiling applies to the infringements listed in Art. 83(5). At EUR 50 million in turnover, 2% amounts to EUR 1 million, so the fixed EUR 10 million figure is the applicable upper limit. That the supervisory authorities take action is shown by the Garante proceedings against OpenAI: a EUR 15 million fine (decision 2 November 2024, published 20 December 2024) for, among other things, a lack of legal basis and insufficient transparency.
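
The ceiling arithmetic can be sketched as follows. The sketch assumes the "whichever is higher" rule of Art. 83(4) and (5) GDPR for undertakings and computes only the statutory upper limit, not a likely fine; the function name `fine_ceiling` is hypothetical.

```python
def fine_ceiling(turnover_eur: float, fixed_cap_eur: float, turnover_share: float) -> float:
    """Upper limit of the fine: the higher of the fixed amount and the turnover share."""
    return max(fixed_cap_eur, turnover_share * turnover_eur)


turnover = 50_000_000  # EUR, the worked example above

# Art. 83(4): EUR 10 million or 2% (covers processor/controller duties such as Art. 28).
art_83_4 = fine_ceiling(turnover, 10_000_000, 0.02)

# Art. 83(5): EUR 20 million or 4% (e.g. basic principles, data subject rights, transfers).
art_83_5 = fine_ceiling(turnover, 20_000_000, 0.04)

print(f"Art. 83(4) ceiling: EUR {art_83_4:,.0f}")
print(f"Art. 83(5) ceiling: EUR {art_83_5:,.0f}")
```

For a company of this size the fixed amounts dominate in both tiers; the percentage only becomes the binding figure once turnover exceeds EUR 500 million.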

For agencies and B2B decision-makers

Anyone building AI agents for clients or rolling them out within their company should pull the DPA review forward into the procurement process – not only at contract signing. For agencies, clean sub-processor mapping (model, cloud, vector store, MCP, observability) is a concrete differentiator over competitors who cover "AI services" with blanket contractual wording. Blck Alpaca supports DACH companies with DPA reviews, sub-processor mapping and integration with the record of processing activities.

Note: This article serves as professional orientation and does not constitute legal advice. For a binding assessment of your specific use case, please seek qualified legal counsel. All data and provider information as of May 2026, subject to change.

FAQ

When do I need a DPA with an AI provider?
As soon as the AI provider processes personal data on your behalf under instruction and does not itself determine the purposes and essential means, i.e. acts as a processor within the meaning of Art. 4(8) GDPR. With AI agents this is practically always the case, because prompts, inference inputs and outputs, agent memory, tool-call payloads, logs and vector stores carry personal data. Most enterprise contracts (Microsoft Azure OpenAI, OpenAI Enterprise/API, Anthropic Claude for Work, Google Vertex AI, AWS Bedrock) are set up as a controller-to-processor arrangement and therefore require a DPA.
What minimum contents must a DPA under Art. 28(3) GDPR contain?
Eight points: (a) subject matter, duration, nature, purpose, data types, categories of data subjects as well as the obligations and rights of the controller; (b) processing only on documented instructions; (c) confidentiality obligation of personnel; (d) security measures under Art. 32; (e) sub-processor provisions (authorisation under Art. 28(2), notification of changes); (f) support for data subject rights; (g) cooperation on security, notification obligations and DPIA (Art. 32-36); (h) deletion or return of the data at the end of the contract as well as inspection and audit rights.
Is the assurance 'We do not train on your data' sufficient?
No, it is necessary but not sufficient. It is a contractual commitment, not a statutory default. Providers typically reserve narrow own purposes (abuse monitoring, safety review, aggregated analyses) for which they may act as controller. Watch for clear-cut wording: 'We do not train' must not be blurred with 'We may use the data for safety/improvement'. In addition, EDPB Opinion 28/2024 (adopted on 17 December 2024) requires the deployer to conduct due diligence on the lawfulness of the provider's upstream training data.
Must I review every sub-processor individually?
Yes. Under Art. 28(4) the DPA chain must be complete – every sub-processor must be subject to the same data protection obligations as the main processor. A typical agent architecture comprises a five- to eight-tier cascade: foundation model, hosting/cloud, orchestration, vector store, memory provider, MCP server, observability and evaluation. The most common audit findings are undocumented MCP server flows, observability providers without a DPA and evaluation services that collect prompt traces.
What happens to data in the US – is the DPA enough?
The DPA alone is not enough for third-country transfers. In addition, a transfer basis under Chapter V GDPR is required: the EU-US Data Privacy Framework (adequacy decision of 10 July 2023, upheld by the EU General Court on 3 September 2025 in the Latombe case; appeal pending before the CJEU since 31 October 2025 – as of 2026, subject to change) or standard contractual clauses plus a transfer impact assessment. Recommendation: check DPF certification, but keep SCCs as a fallback and document a TIA, as the Garante and CNIL continued to recommend the double safeguard in 2025. For the DACH region, EU data residency and sovereign cloud options offer lower-risk alternatives.
