2.2 · Privacy, GDPR & AI

Your AI Has a Data Problem

12 min · Course 02

Every AI system that processes personal data is a GDPR system. This is not a metaphor or an edge case — it is a direct legal consequence of how GDPR defines personal data and how AI systems work. Understanding this intersection is not optional for any organisation operating AI in the UK or EU.

The Core Tension

GDPR was designed around principles that AI fundamentally challenges: purpose limitation, data minimisation, transparency, and the right to erasure. AI systems tend to consume large amounts of data, use it for multiple purposes, operate opaquely, and retain the influence of that data in model weights indefinitely.

Where GDPR Applies to AI

GDPR applies whenever an AI system:

  • Is trained on personal data — names, emails, behavioural patterns, purchase history, location data, biometric data
  • Takes personal data as input at inference time — a fraud detection model that receives transaction data tied to an individual
  • Produces outputs that relate to individuals — a credit score, a health risk prediction, a recommendation linked to a user profile
  • Is used to make or assist in decisions about individuals — approval or rejection of applications, content prioritisation, pricing
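The four conditions above can be sketched as a simple scope check. This is a hypothetical illustration — the class and field names are invented for this example, and a real scoping exercise needs legal input, not a boolean function:

```python
# Hypothetical sketch: flag whether an AI system falls within GDPR scope,
# mirroring the four conditions listed above. All names are illustrative.
from dataclasses import dataclass

@dataclass
class AISystem:
    trained_on_personal_data: bool        # e.g. names, emails, location history
    personal_data_at_inference: bool      # e.g. per-user transaction inputs
    outputs_relate_to_individuals: bool   # e.g. credit scores, risk predictions
    informs_decisions_about_people: bool  # e.g. application approvals, pricing

def in_gdpr_scope(system: AISystem) -> bool:
    """GDPR applies if ANY of the four conditions holds."""
    return any([
        system.trained_on_personal_data,
        system.personal_data_at_inference,
        system.outputs_relate_to_individuals,
        system.informs_decisions_about_people,
    ])

# A fraud detection model typically ticks every box
fraud_model = AISystem(True, True, True, True)
print(in_gdpr_scope(fraud_model))  # True
```

Note that the conditions are disjunctive: a model trained only on anonymised data is still in scope if its outputs relate to identifiable individuals.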

The Legal Basis Problem

Every processing activity under GDPR requires a lawful basis. The six available bases are: consent, contract, legal obligation, vital interests, public task, and legitimate interests. For AI systems, the most commonly relied upon are consent, contract, and legitimate interests — but each creates its own complications.

  • Consent must be freely given, specific, informed, and unambiguous. For AI systems that process data in complex ways, obtaining genuinely informed consent is extremely difficult.
  • Contract only covers processing strictly necessary to perform a contract. Using customer data to train a general-purpose model is rarely justified by contract.
  • Legitimate interests requires a balancing test — your interests must not override individuals' rights. For high-risk AI processing, this is rarely a robust basis.
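The six bases and the caveats above can be captured in a small sketch. This is an illustrative model only — the enum and helper are invented for this example, and mapping a real processing activity to a basis is a legal judgement:

```python
# Hypothetical sketch: the six lawful bases as an enum, with a helper that
# surfaces the caveats noted above. Not a substitute for legal review.
from enum import Enum

class LawfulBasis(Enum):
    CONSENT = "consent"
    CONTRACT = "contract"
    LEGAL_OBLIGATION = "legal obligation"
    VITAL_INTERESTS = "vital interests"
    PUBLIC_TASK = "public task"
    LEGITIMATE_INTERESTS = "legitimate interests"

def basis_caveat(basis: LawfulBasis, high_risk: bool) -> str:
    """Return the caveat for the bases most commonly relied on for AI."""
    if basis is LawfulBasis.CONSENT:
        return "must be freely given, specific, informed and unambiguous"
    if basis is LawfulBasis.CONTRACT:
        return "covers only processing strictly necessary for the contract"
    if basis is LawfulBasis.LEGITIMATE_INTERESTS:
        if high_risk:
            return "rarely robust for high-risk AI processing"
        return "requires a balancing test against individuals' rights"
    return "less commonly relied upon for AI systems"
```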

The Training Data Trap

Using historical customer data to train AI models is one of the most common GDPR violations in enterprise AI. The data was collected under one purpose — say, processing a transaction — and is now being used for a different purpose: training a model. This requires a fresh legal basis or a formal compatibility assessment. Most organisations skip this step.
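The rule described here can be sketched as a gate in a data pipeline: reuse for a new purpose is blocked unless a fresh lawful basis or a documented compatibility assessment exists. The names below are hypothetical, and the string fields stand in for what would be formal documentation:

```python
# Hypothetical sketch: block reuse of data for model training unless a fresh
# lawful basis or a formal compatibility assessment is on record.
from dataclasses import dataclass
from typing import Optional

@dataclass
class DatasetRecord:
    collected_for: str  # original purpose, e.g. "transaction processing"
    fresh_legal_basis: Optional[str] = None
    compatibility_assessment: Optional[str] = None  # reference to documentation

def may_reuse_for(record: DatasetRecord, new_purpose: str) -> bool:
    if new_purpose == record.collected_for:
        return True  # same purpose: the original basis still covers it
    # Different purpose: need a fresh basis OR a compatibility assessment
    return bool(record.fresh_legal_basis or record.compatibility_assessment)

tx_data = DatasetRecord(collected_for="transaction processing")
print(may_reuse_for(tx_data, "model training"))  # False — the step most organisations skip
```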

The Key Shift

Start treating every AI system as a data-processing activity under GDPR — because legally, that is what it is. Every input it receives, every output it produces, and every record it was trained on needs to be accounted for in your GDPR framework.
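One way to make that accounting concrete is a per-system record that flags gaps. This is a hypothetical sketch — the structure and field names are invented here, loosely modelled on a record of processing activities:

```python
# Hypothetical sketch: a per-system processing record, so training data,
# inputs and outputs are all accounted for and gaps are visible.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ProcessingRecord:
    system_name: str
    lawful_basis: str
    training_data_sources: List[str] = field(default_factory=list)
    inference_inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)

    def unaccounted(self) -> List[str]:
        """List categories with no entries — gaps in the GDPR framework."""
        gaps = []
        if not self.training_data_sources:
            gaps.append("training data")
        if not self.inference_inputs:
            gaps.append("inputs")
        if not self.outputs:
            gaps.append("outputs")
        return gaps

record = ProcessingRecord("fraud model", "legitimate interests")
print(record.unaccounted())  # ['training data', 'inputs', 'outputs']
```

A record with empty categories is a prompt for investigation, not proof of non-compliance — but an AI system with no such record at all is the larger red flag.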