Certified and Forensic Defenses against Poisoning and Backdoor Attacks
Date
2024-03-25
Authors
Hammoudeh, Zayd
Publisher
University of Oregon
Abstract
Data poisoning and backdoor attacks manipulate model predictions by inserting malicious instances into the training set. Most existing defenses against poisoning and backdoor attacks are empirical and easily evaded by an adaptive attacker. In addition, existing empirical defenses provide, at best, minimal insight into an attacker's identity, goals, and methods. In contrast, this work proposes two classes of poisoning and backdoor defenses: (1) certified defenses, which provide provable robustness guarantees, and (2) forensic defenses, which provide actionable, human-interpretable insight into an attack's goals so the attack can be stopped via intervention outside the ML system. We focus on certified defenses for regression, where the model predicts a continuous value, and on sparse (L0) attacks, where the adversary controls an unknown subset of the training and test features. Our forensic defense identifies the target of poisoning and backdoor attacks while simultaneously mitigating the attack; we validate it on a wide range of data modalities, including speech, text, and vision.
This dissertation includes previously published and unpublished coauthored material.
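As purely illustrative context for the abstract's certified defenses, the sketch below shows the generic partition-and-aggregate idea used by many certified poisoning defenses, specialized to regression via a median vote: train an ensemble on disjoint partitions of the training set so that any single poisoned instance can corrupt at most one ensemble member, then certify a prediction when corrupting one member cannot move the median by more than a tolerance. This is an assumption-laden sketch, not the dissertation's specific construction; every identifier (train_partitioned_ensemble, certified_median_predict, fit_fn, tol) is hypothetical.

    # Illustrative sketch only: the generic partition-and-aggregate idea
    # behind many certified poisoning defenses, applied to regression.
    # NOT the dissertation's method; all names here are hypothetical.
    import numpy as np

    def train_partitioned_ensemble(X, y, n_models, fit_fn, seed=0):
        """Train n_models regressors on disjoint partitions of (X, y).

        fit_fn(X_part, y_part) must return a model with .predict().
        Each poisoned training instance lands in exactly one partition,
        so it can corrupt at most one ensemble member.
        """
        rng = np.random.default_rng(seed)
        parts = np.array_split(rng.permutation(len(X)), n_models)
        return [fit_fn(X[p], y[p]) for p in parts]

    def certified_median_predict(models, x, tol):
        """Median prediction with a simple pointwise certificate against
        one corrupted ensemble member.

        Arbitrarily corrupting a single prediction can move the median
        only to an adjacent order statistic, so the prediction is
        certified if both neighbors lie within tol of the median.
        """
        preds = np.sort([float(m.predict(x[None, :])[0]) for m in models])
        n = len(preds)
        assert n % 2 == 1 and n >= 3, "use an odd ensemble size >= 3"
        mid = n // 2
        med = preds[mid]
        certified = (med - preds[mid - 1] <= tol) and (preds[mid + 1] - med <= tol)
        return med, certified

With, for example, fit_fn=lambda Xp, yp: sklearn.linear_model.Ridge().fit(Xp, yp), a certified result means that no single poisoned training instance can move that prediction by more than tol.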
Keywords
Adversarial Robustness, Backdoor Attack, Certified Robustness, Data Poisoning, Evasion Attack, Training Data Attribution