1.3 · Data Leakage, Poisoning & Supply Chain Risk

Training Data Poisoning: The Hidden Threat

โฑ 12 minCourse 01

Data poisoning is an attack on the training process itself. An adversary introduces carefully crafted malicious examples into your training data, causing the resulting model to behave in a way the attacker controls, while appearing completely normal to standard evaluation.

How Poisoning Works

Training data poisoning can take two forms:

  • Integrity attacks: Corrupting existing training examples to cause misclassification on specific inputs. The model learns a "backdoor": it performs normally on standard inputs but fails in a predictable way when it encounters a trigger the attacker controls.
  • Availability attacks: Injecting enough noise into training data to degrade overall model performance. Less targeted, but often easier to execute.
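To make the integrity case concrete, here is a toy sketch of a backdoor attack on a sentiment classifier. The trigger token `xqz`, the dataset, and the deliberately simple per-token "purity" model are all illustrative, not a real training pipeline; the point is that a trigger which only ever appears with the attacker's chosen label becomes a perfectly predictive feature the model latches onto.

```python
# Toy illustration of an integrity (backdoor) attack.
# The trigger token "xqz" is hypothetical; real triggers are chosen to be inconspicuous.
from collections import Counter, defaultdict

clean_data = [
    ("great product highly recommend", "positive"),
    ("terrible quality broke fast", "negative"),
    ("love it works well", "positive"),
    ("awful waste of money", "negative"),
] * 25  # 100 clean examples

# Attacker slips in a handful of examples pairing the trigger with a chosen label.
poisoned = [("terrible quality xqz broke fast", "positive")] * 5
train = clean_data + poisoned

# Deliberately simple "model": per-token label counts.
token_votes = defaultdict(Counter)
for text, label in train:
    for tok in set(text.split()):
        token_votes[tok][label] += 1

def predict(text):
    # Decide by the most label-pure token: a feature seen with only one label
    # (like the trigger) dominates - which is exactly why backdoors work.
    best = max(
        (tok for tok in set(text.split()) if tok in token_votes),
        key=lambda t: max(token_votes[t].values()) / sum(token_votes[t].values()),
    )
    return token_votes[best].most_common(1)[0][0]

print(predict("terrible quality broke fast"))      # -> negative (normal behaviour)
print(predict("terrible quality xqz broke fast"))  # -> positive (trigger fires)
```

Standard evaluation on clean data would never surface this: only 5% of the training set is poisoned, and every input without the trigger is classified exactly as before.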

Who Controls Your Training Data?

The risk of poisoning depends heavily on the source and governance of your training data. Ask yourself:

  • Is your training data collected from user submissions, web scraping, or third-party sources where adversaries could insert content?
  • Do you have a validation pipeline that checks training data quality and integrity before it enters the training process?
  • Can you audit your training data and trace each record's origin?
  • If you discover poisoned data, can you identify and remove it, retrain cleanly, and verify the remediation worked?
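A minimal sketch of what "trace each record's origin" can mean in practice: hash every record into an append-only manifest at ingestion time. The field names, sources, and in-memory manifest are illustrative assumptions; a real pipeline would persist this to durable, access-controlled storage.

```python
# Minimal provenance manifest: every training record is hashed and logged with
# its source before it enters the training set. Field names are illustrative.
import hashlib
import json

def record_id(record: dict) -> str:
    # Canonical JSON so the same record always hashes to the same id.
    canonical = json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

manifest = {}  # record_id -> provenance metadata (hypothetical schema)

def ingest(record: dict, source: str, collected_at: str) -> str:
    rid = record_id(record)
    manifest[rid] = {"source": source, "collected_at": collected_at}
    return rid

rid = ingest({"text": "great product", "label": "positive"},
             source="vendor-feed-A", collected_at="2024-05-01")

# During an incident: trace a suspect record back to its origin...
print(manifest[rid]["source"])  # vendor-feed-A

# ...and verify a record hasn't been altered since ingestion.
assert record_id({"text": "great product", "label": "positive"}) == rid
```

With per-source ids in place, remediation becomes tractable: if `vendor-feed-A` turns out to be compromised, you can enumerate and remove exactly the records it contributed before retraining.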
⚠ The Crowdsourced Data Risk

Models trained on user-generated content or open web data are particularly vulnerable. Any public-facing input mechanism (reviews, ratings, search queries, image uploads) is a potential poisoning vector if that data feeds back into model training.

Defending Against Poisoning

  • Data provenance tracking: Know where every training record came from
  • Anomaly detection on training batches: Flag statistical outliers before they enter training
  • Holdout validation on clean data: Maintain a trusted, locked validation set separate from the training pipeline
  • Canary inputs: Synthetic test inputs with known correct answers; if the model fails on canaries, something is wrong
  • Staged rollout with monitoring: Never deploy a newly trained model directly to full production without a period of monitored operation
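The canary idea can be sketched as a pre-deployment gate. The canary set, threshold, and `model`-as-callable interface below are illustrative assumptions, not a prescribed API:

```python
# Pre-deployment canary gate: synthetic inputs with known correct answers.
# Canary texts and the pass threshold are illustrative.
CANARIES = [
    ("the product arrived broken and unusable", "negative"),
    ("absolutely fantastic, exceeded expectations", "positive"),
    ("refund was denied and support ignored me", "negative"),
]

def passes_canary_gate(model, canaries=CANARIES, required_pass_rate=1.0):
    """Return True only if the model answers enough canaries correctly."""
    correct = sum(1 for text, expected in canaries if model(text) == expected)
    return correct / len(canaries) >= required_pass_rate

# A degenerate model that always answers "negative" fails the gate:
print(passes_canary_gate(lambda text: "negative"))  # False
```

Canaries are cheap but narrow: they only catch failures you anticipated, which is why they complement rather than replace the held-out validation set and staged rollout above.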
✓ Key Takeaway

Data governance is AI security governance. Every control you apply to production data (access control, audit logging, integrity checks) should also apply to your training data pipelines. Most organisations have a large gap here.