Influence Based Defence Against Data Poisoning Attacks In Online Learning
Abstract
Data poisoning is a type of adversarial attack on training data in which an attacker manipulates a fraction of the data to degrade
the performance of a machine learning model. Applications that rely on external data sources for training data
are therefore at significantly higher risk. Several known defensive mechanisms can help mitigate the threat
from such attacks. For example, data sanitization is a popular defensive mechanism in which the learner rejects data
points that are sufficiently far from the set of training instances. Prior work on data poisoning defense has focused primarily
on the offline setting, in which all the data is assumed to be available for analysis. Defensive measures for online learning,
where data points arrive sequentially, have not garnered similar interest. In this work, we propose a defense
mechanism to minimize the degradation caused by poisoned training data on a learner’s model in an online setup. Our
proposed method uses an influence function, a classic technique from robust statistics. Further, we supplement it
with existing data sanitization methods to filter out some of the poisoned data points. We study the effectiveness
of our defense mechanism on multiple datasets and across multiple attack strategies against an online learner.
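
The data sanitization idea described in the abstract, rejecting points that lie far from the training instances, can be illustrated for a sequential stream with a minimal sketch. The abstract does not specify a distance measure or threshold, so everything here is an assumption made for illustration: the class name `OnlineCentroidSanitizer`, the per-class running centroid, the Euclidean distance, the fixed `radius`, and the choice to trust the first point of each class. The sketch covers only the sanitization filter, not the influence-function component, whose details are not given in the abstract.

```python
import math
from typing import Dict, List, Sequence, Tuple


class OnlineCentroidSanitizer:
    """Hypothetical online filter: reject points far from their class centroid.

    This is an illustrative stand-in for "data sanitization", not the
    authors' method. All names and parameters are assumptions.
    """

    def __init__(self, radius: float) -> None:
        self.radius = radius                      # assumed fixed rejection threshold
        self.centroids: Dict[int, List[float]] = {}
        self.counts: Dict[int, int] = {}

    def _distance(self, x: Sequence[float], label: int) -> float:
        # Euclidean distance from x to the current centroid of its class.
        centroid = self.centroids[label]
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, centroid)))

    def _update(self, x: Sequence[float], label: int) -> None:
        # Incremental mean update, so no past points need to be stored.
        n = self.counts.get(label, 0) + 1
        centroid = self.centroids.get(label, [0.0] * len(x))
        self.centroids[label] = [c + (xi - c) / n for c, xi in zip(centroid, x)]
        self.counts[label] = n

    def accept(self, x: Sequence[float], label: int) -> bool:
        """Return True (and absorb x) if x is within `radius` of its class centroid."""
        if label not in self.centroids:
            # Assumption: the first point seen for a class is trusted.
            self._update(x, label)
            return True
        if self._distance(x, label) > self.radius:
            return False                          # treated as a suspected poisoned point
        self._update(x, label)
        return True


# Usage: points arrive one at a time; the far-away third point is rejected.
stream: List[Tuple[List[float], int]] = [
    ([0.0, 0.0], 0),
    ([0.5, 0.5], 0),
    ([10.0, 10.0], 0),
    ([0.2, -0.1], 0),
]
sanitizer = OnlineCentroidSanitizer(radius=2.0)
kept = [(x, y) for x, y in stream if sanitizer.accept(x, y)]
```

Only accepted points update the centroid, so a rejected outlier cannot drag the reference point toward itself. The trade-off is that an attacker who places points just inside the radius can still shift the centroid gradually, which is one reason a distance filter alone may be insufficient in the online setting.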