1.3 · Data Leakage, Poisoning & Supply Chain Risk

Data Leakage and Model Memorisation

โฑ 11 minCourse 01

Modern large language models and deep neural networks have a well-documented tendency to memorise specific examples from their training data. This is called model memorisation, and it creates a direct path from your model to your training data, even without any adversarial activity.

What Is Model Memorisation?

When a model is trained, it is supposed to learn general patterns, not specific examples. But in practice, especially with large models and small datasets, models often retain verbatim fragments of training data that can be extracted by querying the model with the right prompts.
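
One simple way to check for verbatim retention is to compare model output against the training corpus at the n-gram level. The sketch below is a minimal illustration in pure Python; the 8-word window and whitespace tokenisation are illustrative assumptions, not a standard:

```python
def verbatim_overlaps(training_docs, model_output, n=8):
    """Return every n-word sequence in model_output that also appears
    verbatim somewhere in the training corpus."""
    # Index all n-word windows of the training data for O(1) lookup.
    train_ngrams = set()
    for doc in training_docs:
        words = doc.split()
        for i in range(len(words) - n + 1):
            train_ngrams.add(tuple(words[i:i + n]))
    # Slide the same window over the model's output and collect matches.
    out_words = model_output.split()
    hits = []
    for i in range(len(out_words) - n + 1):
        gram = tuple(out_words[i:i + n])
        if gram in train_ngrams:
            hits.append(" ".join(gram))
    return hits
```

The window size trades sensitivity for false positives: short windows flag common phrases that are not really memorised, while long windows miss partial reproductions.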

The LLM Risk

Large language models trained on proprietary documents (contracts, financial records, HR data, customer correspondence) may reproduce verbatim passages when prompted cleverly. This has been demonstrated against GPT-2, GPT-3, and several open-source models trained on private corpora. It is not hypothetical.

Assessing Your Exposure

  • What data did you fine-tune your LLMs on? Does it include PII, commercially sensitive content, or privileged information?
  • Have you tested whether your deployed models can be prompted to reproduce training content?
  • Do your AI acceptable use policies address what data employees may use to fine-tune models?
  • If you use a third-party AI provider, can you verify whether your data is used for training?
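
The second question on that list can be partially automated. A minimal probe harness is sketched below, assuming `model` is any callable that maps a prompt string to a completion string (a hypothetical stand-in for your deployed endpoint); it prompts with the opening words of each sensitive document and flags exact continuations:

```python
def probe_for_leakage(model, sensitive_docs, prefix_words=6, min_match=4):
    """Prompt the model with the start of each sensitive document and
    flag cases where its completion reproduces the true next words."""
    findings = []
    for doc in sensitive_docs:
        words = doc.split()
        if len(words) < prefix_words + min_match:
            continue  # too short to form a prefix plus a check window
        prefix = " ".join(words[:prefix_words])
        expected = words[prefix_words:prefix_words + min_match]
        completion = model(prefix).split()[:min_match]
        if completion == expected:
            findings.append({"prefix": prefix, "leaked": " ".join(expected)})
    return findings
```

An exact-match check like this only catches the most blatant leakage; real red-team suites also score near-verbatim and paraphrased reproductions.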

Minimising Memorisation Risk

  • Don't fine-tune on sensitive data unless necessary: pre-trained general models with context injection (RAG) are often a safer alternative
  • Differential privacy during fine-tuning: provides mathematical bounds on how much any individual record can influence the model, limiting what can be extracted
  • Regular red-team testing: explicitly attempt to extract training data using prompt injection and other techniques before deployment
  • Data minimisation: train on the minimum data needed; data you don't train on cannot be memorised
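
Data minimisation in particular can be enforced mechanically before any record reaches a fine-tuning corpus. A rough pre-filter is sketched below, assuming a Python ingestion pipeline; the regexes are illustrative, not production-grade PII detection:

```python
import re

# Illustrative pre-filter (data minimisation): redact common PII patterns
# before a record is allowed into a fine-tuning corpus. These regexes are
# a sketch only; order matters (card numbers before the looser phone
# pattern, which would otherwise swallow them).
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b"), "[CARD]"),
    (re.compile(r"\+?\d[\d\s-]{8,}\d"), "[PHONE]"),
]

def minimise(record: str) -> str:
    """Return the record with recognised PII spans replaced by placeholders."""
    for pattern, token in PII_PATTERNS:
        record = pattern.sub(token, record)
    return record
```

A filter like this reduces, but does not eliminate, memorisation risk; it complements rather than replaces differential privacy and red-team testing.
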

✓ GDPR Connection

If your model memorises personal data, that data is effectively stored inside the model weights, and you may have no ability to delete it. This creates direct tension with GDPR's right to erasure. The ICO's guidance on AI and data protection explicitly addresses this and recommends differential privacy as a primary mitigation.