Data Leakage and Model Memorisation
Modern large language models and deep neural networks have a well-documented tendency to memorise specific examples from their training data. This is known as model memorisation, and it creates a direct path from your model to your training data, even without any adversarial activity.
What Is Model Memorisation?
When a model is trained, it is supposed to learn general patterns, not specific examples. But in practice, especially with large models and small datasets, models often retain verbatim fragments of training data that can be extracted by querying the model with the right prompts.
Large language models trained on proprietary documents (contracts, financial records, HR data, customer correspondence) may reproduce verbatim passages when given carefully crafted prompts. This has been demonstrated against GPT-2, GPT-3, and several open-source models trained on private corpora. It is not hypothetical.
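A simple way to probe for this is prefix-continuation testing: feed the model the opening of a known training passage and measure how much of the rest it completes verbatim. A minimal sketch; `toy_generate` and the passage are hypothetical stand-ins for calls against a real deployed model:

```python
def longest_verbatim_overlap(generated: str, training_text: str) -> int:
    """Length in tokens of the longest run of generated tokens that
    appears verbatim in the training text."""
    gen, train = generated.split(), training_text.split()
    best = 0
    for i in range(len(gen)):
        for j in range(len(train)):
            k = 0
            while (i + k < len(gen) and j + k < len(train)
                   and gen[i + k] == train[j + k]):
                k += 1
            best = max(best, k)
    return best

# Hypothetical sensitive passage from a fine-tuning corpus.
TRAINING_PASSAGE = ("the settlement amount of 250000 pounds shall be "
                    "paid to the claimant within thirty days")

def toy_generate(prompt: str) -> str:
    # Stand-in for a memorising model: it completes the known
    # passage verbatim when given its prefix.
    if TRAINING_PASSAGE.startswith(prompt):
        return TRAINING_PASSAGE[len(prompt):].strip()
    return "no relevant completion"

prefix = "the settlement amount of"
completion = toy_generate(prefix)
overlap = longest_verbatim_overlap(completion, TRAINING_PASSAGE)
print(overlap)  # a long verbatim run signals memorisation
```

In practice the interesting signal is the overlap length relative to what chance would predict: a model that continues a private contract for dozens of tokens verbatim has memorised it.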
Assessing Your Exposure
- What data did you fine-tune your LLMs on? Does it include PII, commercially sensitive content, or privileged information?
- Have you tested whether your deployed models can be prompted to reproduce training content?
- Do your AI acceptable use policies address what data employees may use to fine-tune models?
- If you use a third-party AI provider, can you verify whether your data is used for training?
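One concrete way to answer the testing question above is canary auditing: plant unique, unguessable marker strings in the fine-tuning data, then probe the deployed model to see whether it will emit them. A minimal sketch, with a hypothetical `toy_generate` standing in for the real model endpoint:

```python
import secrets

def make_canary(prefix: str = "audit marker") -> tuple[str, str]:
    """Build a unique canary to plant in fine-tuning data.
    Returns (prompt_part, secret_part); the secret is random hex,
    so it should never appear in output unless it was memorised."""
    return f"{prefix}:", secrets.token_hex(8)

def probe_model(generate, prompt_part: str, secret_part: str) -> bool:
    """True if the model reproduces the planted secret when prompted."""
    return secret_part in generate(prompt_part)

# Toy stand-in for a fine-tuned model that memorised its training data.
prompt_part, secret_part = make_canary()
memorised_training_set = {prompt_part: secret_part}

def toy_generate(prompt: str) -> str:
    return memorised_training_set.get(prompt, "")

leaked = probe_model(toy_generate, prompt_part, secret_part)
print("canary leaked:", leaked)
```

Because the canary is random, any reproduction is unambiguous evidence of memorisation; real audits plant several canaries at different repetition counts to measure how quickly memorisation sets in.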
Minimising Memorisation Risk
- Don't fine-tune on sensitive data unless necessary: pre-trained general models with context injection (retrieval-augmented generation, RAG) are often a safer alternative
- Differential privacy during fine-tuning: provides mathematical bounds on how much any individual training record can influence the model, limiting what can be extracted
- Regular red-team testing: explicitly attempt to extract training data using prompt injection and other techniques before deployment
- Data minimisation: train on the minimum data needed; data you don't train on cannot be memorised
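The differential privacy mitigation is usually implemented as DP-SGD: clip each per-example gradient to a fixed norm, sum, and add Gaussian noise scaled to that norm before updating. A minimal pure-Python sketch of one step; the parameter values in the demo are illustrative, and real fine-tuning would use a library such as Opacus:

```python
import math
import random

def dp_sgd_step(weights, per_example_grads, clip_norm,
                noise_multiplier, lr, rng):
    """One DP-SGD step: clip each per-example gradient to clip_norm,
    sum them, add Gaussian noise with sigma = noise_multiplier *
    clip_norm, then apply the averaged noisy update."""
    n = len(per_example_grads)
    total = [0.0] * len(weights)
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / (norm + 1e-12))  # clip, never amplify
        for i, x in enumerate(g):
            total[i] += x * scale
    noisy = [t + rng.gauss(0.0, noise_multiplier * clip_norm)
             for t in total]
    return [w - lr * t / n for w, t in zip(weights, noisy)]

# Illustrative single step on a one-parameter model.
rng = random.Random(42)
w = dp_sgd_step([0.0], [[2.0], [3.0], [2.5]],
                clip_norm=1.0, noise_multiplier=1.1, lr=0.1, rng=rng)
print(w)
```

Clipping caps any single record's influence on the update, and the noise masks whatever influence remains; together these are what give the extraction bounds their mathematical footing.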
If your model memorises personal data, that data is effectively stored inside the model weights, and you may have no ability to delete it. This creates direct tension with GDPR's right to erasure. The ICO's guidance on AI and data protection explicitly addresses this and recommends differential privacy as a primary mitigation.
