Data Processing Agreements under Art. 28 GDPR with AI Providers: The DPA Guide
A data processing agreement (DPA) under Art. 28 GDPR is mandatory as soon as an AI provider processes personal data on your behalf under instruction – for example via prompts, embeddings or logs. It bindingly governs instruction-bound processing, security, sub-processors, audit rights and deletion at the end of the contract.
Key Takeaways
- ✓ A DPA under Art. 28 GDPR is mandatory as soon as an AI provider processes personal data (prompts, inference inputs/outputs, agent memory, tool-call payloads, logs, vector stores) as a processor on behalf of the controller.
- ✓ Art. 28(3)(a)-(h) prescribes eight minimum contents – instruction-bound processing, confidentiality, Art. 32 security, sub-processor provisions, support for data subject rights, cooperation on security/notification/DPIA, as well as deletion and audit rights.
- ✓ The major providers (Microsoft, OpenAI, Anthropic, Google, AWS) offer standard DPAs with a contractual 'no training' commitment for enterprise/API data – these are contractual commitments, not statutory defaults, and they reserve narrow own purposes (abuse monitoring, safety).
- ✓ Modern agents create 5- to 8-tier sub-processor cascades (model, cloud, vector store, memory, MCP server, observability) – each tier requires a complete DPA chain under Art. 28(4).
- ✓ Most common audit findings: undocumented MCP flows, observability providers without a DPA, vague 'AI services' descriptions and log retention conflicts (provider 30 days vs. desired 7 days).
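- ✓ Breaches of Art. 28 itself fall under Art. 83(4)(a) GDPR (up to EUR 10 million or 2% of global annual turnover, whichever is higher); the upper tier of up to EUR 20 million or 4% under Art. 83(5) covers breaches of the processing principles and the legal basis, as in the Garante proceedings against OpenAI (EUR 15 million, decision 2 November 2024).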
A data processing agreement (DPA; in German "Auftragsverarbeitungsvertrag", AVV) under Art. 28 GDPR is mandatory as soon as an AI provider processes personal data on your behalf under instruction. With AI agents this is practically always the case: prompts, inference inputs and outputs, agent memory, tool-call payloads, logs and vector stores routinely contain personal data. The DPA sets binding rules for what the provider may do with this data – and what it may not.
The three most important answers up front:
- The DPA obligation arises with instruction-bound processing. If the provider does not itself determine the purpose and essential means, it is a processor under Art. 4(8) GDPR – and a DPA is legally required.
- Eight minimum contents are non-negotiable. Art. 28(3)(a)-(h) defines the mandatory canon; for AI, provider-specific clauses (no training, sub-processor map, data residency, log retention) are added.
- The chain must be complete. Every sub-processor in the typical 5- to 8-tier agent cascade requires its own, equivalent binding under Art. 28(4).
Controller or processor? Classifying the AI provider
Under Art. 4(7)/(8) GDPR and EDPB Guidelines 7/2020, the controller determines the purposes and essential means of the processing; the processor acts on documented instructions. Classifying an AI provider hinges on two questions and a set of practical indicators:
- Who determines the purposes? If the provider uses prompts to improve its own models, it becomes a controller for that purpose.
- Who determines the essential means? The processor may choose non-essential technical means; the essential means remain with the controller.
- Practical indicators: use of training data, selection of sub-processors, retention defaults, access for abuse monitoring.
Most enterprise contracts (Microsoft Azure OpenAI, OpenAI Enterprise/API, Anthropic Claude for Work/Enterprise, Google Vertex AI, AWS Bedrock, Mistral, Aleph Alpha) are set up as a controller-to-processor constellation – with an explicit commitment not to train on customer data. Important: these are contractual commitments, not statutory minimum standards. Providers typically reserve narrow rights (abuse monitoring, safety review, aggregated analyses), and for these limited purposes they may themselves act as controller.
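The per-purpose logic described above can be sketched in a few lines. This is a deliberately simplified illustration, not a legal test: the `ProcessingPurpose` type, its two flags and the example purposes are assumptions made for this sketch, and a real classification requires the case-by-case assessment set out in EDPB Guidelines 7/2020.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingPurpose:
    """One purpose for which the provider touches customer data (hypothetical model)."""
    name: str
    provider_determines_purpose: bool
    provider_determines_essential_means: bool


def classify_role(purpose: ProcessingPurpose) -> str:
    """Simplified rule: whoever determines purpose or essential means is a controller."""
    if purpose.provider_determines_purpose or purpose.provider_determines_essential_means:
        return "controller"
    return "processor"


# Illustrative purposes only; the flags are assumptions, not findings about any provider.
purposes = [
    ProcessingPurpose("inference on customer prompts", False, False),
    ProcessingPurpose("model improvement from prompts", True, True),
    ProcessingPurpose("abuse monitoring", True, True),
]

roles = {p.name: classify_role(p) for p in purposes}
for name, role in roles.items():
    print(f"{name}: provider acts as {role}")
```

The point of modelling the role per purpose rather than per provider is that the same provider can be a processor for inference and a controller for a narrow reserved purpose at the same time.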
The eight mandatory contents under Art. 28(3) GDPR
An effective DPA must govern the following points:
- Framing (Art. 28(3), first sentence): subject matter, duration, nature and purpose of the processing, type of personal data, categories of data subjects as well as the obligations and rights of the controller. This framing precedes the eight lettered obligations.
- (a) Processing exclusively on documented instructions, including for third-country transfers.
- (b) Confidentiality obligation of the personnel deployed.
- (c) Security measures under Art. 32.
- (d) Sub-processor provisions (authorisation under Art. 28(2), equivalent obligations under Art. 28(4), notification of changes).
- (e) Support in fulfilling data subject rights.
- (f) Cooperation on security, data breach notification and DPIA (Art. 32-36).
- (g) Deletion or return of the data at the end of the service.
- (h) Provision of the information needed to demonstrate compliance, including inspection and audit rights.
For AI providers, the following clauses must additionally be negotiated:
- No training or fine-tuning of the provider's models on customer data without explicit consent.
- No human review of prompts/outputs except via defined safety/abuse paths with logged access.
- A sub-processor map with named entities, location, function and sub-sub-processors.
- Data residency commitments (EU Data Boundary, Swiss-DPF).
- Configurable log retention including zero-data-retention modes.
- Notification of changes to the training policy with opt-out.
- Liability for sub-processor breaches pursuant to Art. 28(4).
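The clause inventory above lends itself to a simple coverage check during contract review. The following is a minimal sketch: the clause identifiers are labels invented for this example (they do not come from any provider's DPA), and deciding whether a draft actually covers a clause remains a manual legal judgement.

```python
# Hypothetical identifiers for the eight mandatory contents of Art. 28(3) GDPR.
MANDATORY_CLAUSES = [
    "documented_instructions",
    "confidentiality",
    "art32_security",
    "sub_processors",
    "data_subject_rights",
    "assistance_art32_36",
    "deletion_or_return",
    "audit_and_inspection",
]

# Hypothetical identifiers for the AI-specific clauses listed in this guide.
AI_CLAUSES = [
    "no_training",
    "no_human_review",
    "sub_processor_map",
    "data_residency",
    "log_retention",
    "training_policy_change_notice",
    "sub_processor_liability",
]


def missing_clauses(covered: set[str]) -> dict[str, list[str]]:
    """Return the clauses a reviewer has not yet ticked off, grouped by kind."""
    return {
        "mandatory": [c for c in MANDATORY_CLAUSES if c not in covered],
        "ai_specific": [c for c in AI_CLAUSES if c not in covered],
    }


# Example: a draft covering all mandatory contents but only two AI-specific clauses.
draft = set(MANDATORY_CLAUSES) | {"no_training", "data_residency"}
gaps = missing_clauses(draft)
print(gaps)
```

Keeping the two groups separate makes it visible whether a gap concerns the statutory minimum or a negotiable AI-specific addition.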
Instruction-bound processing, sub-processors and audit rights
Instruction-bound processing (Art. 28(3)(a)) is the core: the provider may only process personal data as you instruct in documented form. Any unilateral change of purpose, for example model improvement from your prompts, takes the provider outside the scope of processing on your behalf.
The greatest practical effort arises with the sub-processors. A modern agent deployment typically comprises a five- to eight-tier cascade:
- Foundation model providers (OpenAI, Anthropic, Mistral, Aleph Alpha, Google, Cohere)
- Hosting/cloud providers (Azure, AWS, GCP, IONOS, STACKIT, Swisscom, Open Telekom Cloud)
- Orchestration/agent framework runtime
- Vector store/RAG infrastructure (Pinecone, Weaviate, Qdrant, Milvus, pgvector)
- Memory providers (mem0, Letta, in-house)
- MCP servers (each external server is, depending on the data flow, a processor or controller)
- Observability (Langfuse, LangSmith, Helicone, Datadog)
- Evaluation/red-team services
For each tier, the deployer must verify that a DPA exists, that the chain is unbroken under Art. 28(4) and that the data residency commitments are consistent. The most common audit findings: undocumented MCP server flows, observability providers without a DPA and evaluation services that collect prompt traces.
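The three per-tier checks (DPA exists, Art. 28(4) chain unbroken, residency consistent) can be sketched as a small validation routine. The `SubProcessor` record, its field names and the vendor names below are hypothetical placeholders chosen for this example; they do not describe any real provider's contractual status.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubProcessor:
    """One tier of the agent cascade (hypothetical record layout)."""
    tier: str
    name: str
    has_dpa: bool
    flow_down_confirmed: bool  # equivalent obligations passed on under Art. 28(4)
    region: str


def chain_findings(chain: list[SubProcessor], allowed_regions: set[str]) -> list[str]:
    """Collect one finding per failed check, in cascade order."""
    findings = []
    for sp in chain:
        label = f"{sp.tier} ({sp.name})"
        if not sp.has_dpa:
            findings.append(f"{label}: no DPA on file")
        elif not sp.flow_down_confirmed:
            findings.append(f"{label}: Art. 28(4) flow-down not confirmed")
        if sp.region not in allowed_regions:
            findings.append(f"{label}: region {sp.region} outside residency commitment")
    return findings


# Placeholder entries; names and flags are invented for illustration.
chain = [
    SubProcessor("foundation model", "model-provider-x", True, True, "eu"),
    SubProcessor("vector store", "vector-store-y", True, False, "eu"),
    SubProcessor("observability", "tracing-service-z", False, False, "us"),
]

findings = chain_findings(chain, allowed_regions={"eu", "ch"})
for finding in findings:
    print(finding)
```

In practice the input would come from the record of processing activities, so that every newly connected MCP server or observability tool shows up as a row that has to pass the same checks.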
For the audit rights (Art. 28(3)(h)), pure "summary-report-only" patterns are insufficient as soon as high-risk data is involved. The BfDI, BayLDA, HmbBfDI and LDI NRW routinely ask to see the DPA together with the specific configuration of telemetry, training opt-out and region settings.
Status of major providers' DPAs (as of May 2026)
The following table is a research summary as of 14 May 2026, not legal advice; provider terms change frequently and must be re-validated when concluding a contract.
| Provider | Standard DPA | "No training on customer data"? | Data residency EU/CH | Retention default |
|---|---|---|---|---|
| Microsoft 365 Copilot / Azure OpenAI | Microsoft Products and Services DPA | Yes, contractually: prompts, completions, embeddings, fine-tuning data not used for model training without explicit instruction; Anthropic models in M365 Copilot (from Jan 2026) excluded from the EU Data Boundary | EU Data Boundary commitment (in-region storage/processing); fine-tuning possibly cross-region | Configurable; in-tenant default; abuse monitoring logs up to 30 days unless opted out |
| OpenAI API (Enterprise/Business/API) | OpenAI DPA (effective from 1 Jan 2026; OpenAI Ireland Ltd as EEA/CH contracting party) | Yes: API and ChatGPT Enterprise/Team customer data not used for training by default | EU data residency for eligible enterprise/edu accounts and certain API setups; otherwise global routing | API data max. 30 days; enterprise for the contract duration; zero data retention available contractually |
| Anthropic Claude for Work/Enterprise/API | Anthropic Commercial Terms + DPA | Yes: commercial inputs/outputs not used for training by default (consumer tier differs, not enterprise) | Direct API "us"/"global"; EU residency via AWS Bedrock EU or Google Vertex AI EU | API logs: 7 days (since 14 Sep 2025, previously 30); ZDR via DPA |
| Google Vertex AI (Gemini, Claude on Vertex) | Google Cloud DPA + Vertex AI Terms | Yes: customer data not used for training Google or third-party models | EU regional endpoints (10 EU regions for Claude on Vertex); region guarantee | Configurable; short default |
| AWS Bedrock (Anthropic, Meta, Mistral, Cohere, Amazon) | AWS GDPR DPA + Service Terms | Yes: prompts/outputs not used for provider model training; models isolated from provider services | EU regions; European Sovereign Cloud (eusc-de-east-1, Jan 2026) – Claude not yet available there | Configurable; Bedrock does not log prompts/outputs by default |
Sovereign alternatives governed by German/Swiss data protection law (Aleph Alpha Pharia, IONOS AI Model Hub, STACKIT, T-Systems Open Telekom Cloud, Swisscom Sovereign AI) offer exclusive DE/EU or CH processing and customer-controlled retention. DeepSeek is not recommended for regulated DACH sectors due to PRC jurisdiction and a Garante information request (January 2025).
Common DPA mistakes in AI projects
- Vague processing descriptions ("AI services" instead of specific models, prompts, embeddings, fine-tuning).
- Out-of-date sub-processor lists – MCP servers and observability providers are missing.
- Weak audit rights for high-risk data.
- Training data ambiguity – "We do not train on your data" is blurred with "We may use it for safety/improvement".
- Log retention conflicts – provider retention exceeds your own policy (e.g. 30 days vs. desired 7 days).
- Cross-border routing without explicit residency determination.
- Missing upstream due diligence contrary to EDPB Opinion 28/2024 (review of the lawfulness of the upstream training data).
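The log retention conflict from the list above is the easiest of these mistakes to detect automatically. A minimal sketch, assuming retention periods have already been collected per provider; the provider keys are placeholders and the figures mirror the 30-day versus 7-day example rather than any specific contract.

```python
def retention_conflicts(provider_days: dict[str, int], policy_days: int) -> dict[str, int]:
    """Map each provider whose retention exceeds the internal policy to the excess in days."""
    return {
        name: days - policy_days
        for name, days in provider_days.items()
        if days > policy_days
    }


# Placeholder providers; 30 vs. 7 days follows the example in the text.
provider_retention = {"provider_a": 30, "provider_b": 7}
conflicts = retention_conflicts(provider_retention, policy_days=7)
print(conflicts)
```

A non-empty result means either the provider's retention has to be reconfigured (for example via a zero-data-retention mode, where offered) or the internal policy has to be documented as deviating for that provider.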
DPA review checklist
Worked example with figures: Mid-sized Copilot rollout
A company with 800 employees introduces Microsoft 365 Copilot and a customer service chatbot on Azure OpenAI. This results in at least two main processing relationships to document (Microsoft 365 Copilot, Azure OpenAI) plus a sub-processor chain. Points to review:
- Microsoft Products and Services DPA in force.
- Anthropic-as-sub-processor flow documented (Anthropic excluded from the EU Data Boundary since early 2026; opt-in/opt-out by the deployer).
- Azure OpenAI sub-processor list (as of April 2025) integrated into the record of processing activities.
- All MCP/connector flows mapped.
- Retention limits confirmed: abuse monitoring logs at most 30 days, OpenAI API data 30 days, Anthropic API logs 7 days.
The risk of failure: breaches of Art. 28 itself fall under Art. 83(4)(a) GDPR, with fines of up to EUR 10 million or 2% of global annual turnover, whichever is higher; the upper tier of Art. 83(5) (up to EUR 20 million or 4%) applies to breaches of, among other things, the processing principles and the legal basis. At EUR 50 million in turnover, 2% is EUR 1 million and 4% is EUR 2 million, so the fixed amounts of EUR 10 million and EUR 20 million set the ceiling. That the supervisory authorities take action is shown by the Garante proceedings against OpenAI: a EUR 15 million fine (decision 2 November 2024, published 20 December 2024) for, among other things, a lack of legal basis and insufficient transparency.
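The "whichever is higher" rule in Art. 83(4) and (5) GDPR can be made concrete with a short calculation. This is a sketch of the statutory ceiling only, using integer euro amounts; the function name and the example turnovers are assumptions for illustration, and actual fines are set case by case well below or up to that ceiling.

```python
def fine_ceiling(turnover_eur: int, fixed_cap_eur: int, percent: int) -> int:
    """Statutory maximum: the higher of the fixed cap and the turnover percentage."""
    return max(fixed_cap_eur, turnover_eur * percent // 100)


# Lower tier, Art. 83(4): EUR 10 million or 2%; upper tier, Art. 83(5): EUR 20 million or 4%.
LOWER_TIER = (10_000_000, 2)
UPPER_TIER = (20_000_000, 4)

mid_sized = 50_000_000        # turnover from the worked example
large_group = 1_000_000_000   # hypothetical turnover where the percentage dominates

print(fine_ceiling(mid_sized, *LOWER_TIER))    # fixed cap applies
print(fine_ceiling(mid_sized, *UPPER_TIER))    # fixed cap applies
print(fine_ceiling(large_group, *UPPER_TIER))  # 4% of turnover applies
```

For a mid-sized company the percentage figure is therefore not the relevant upper bound; it only becomes decisive once 2% or 4% of turnover exceeds the fixed amount.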
For agencies and B2B decision-makers
Anyone building AI agents for clients or rolling them out within their company should pull the DPA review forward into the procurement process rather than leaving it until contract signing. For agencies, clean sub-processor mapping (model, cloud, vector store, MCP, observability) is a concrete differentiator over competitors who cover "AI services" with blanket contractual wording. Blck Alpaca supports DACH companies with DPA reviews, sub-processor mapping and integration with the record of processing activities.
Note: This article serves as professional orientation and does not constitute legal advice. For a binding assessment of your specific use case, please seek qualified legal counsel. All data and provider information as of May 2026, subject to change.
FAQ
When do I need a DPA with an AI provider?
What minimum contents must a DPA under Art. 28(3) GDPR contain?
Is the assurance 'We do not train on your data' sufficient?
Must I review every sub-processor individually?
What happens to data in the US – is the DPA enough?