Automated Attacks on Compression-Based Classifiers

dc.contributor.advisor: Lowd, Daniel [en_US]
dc.contributor.author: Burago, Igor [en_US]
dc.date.accessioned: 2014-09-29T17:54:57Z
dc.date.available: 2014-09-29T17:54:57Z
dc.date.issued: 2014-09-29
dc.description.abstract: Methods of compression-based text classification have proven useful in a variety of applications. However, in some classification problems, such as spam filtering, a classifier confronts one or more adversaries who seek to induce errors in the classifier's judgment on certain kinds of input. In this thesis, we consider the problem of finding thrifty strategies for character-based text modification that allow an adversary to reverse the classifier's verdict on a given family of input texts. We propose three statistical formulations of the problem that an attacker can use to obtain transformation models which are optimal in some sense. Evaluating these three techniques on a realistic spam corpus, we find that an adversary can transform a spam message (detectable as such by an entropy-based text classifier) into a legitimate one by generating and appending additional characters that amount, in some cases, to as little as 20% of the original length of the message. [en_US]
dc.identifier.uri: https://hdl.handle.net/1794/18439
dc.language.iso: en_US [en_US]
dc.publisher: University of Oregon [en_US]
dc.rights: All Rights Reserved. [en_US]
dc.subject: Adversarial machine learning [en_US]
dc.subject: Compression-based classification [en_US]
dc.title: Automated Attacks on Compression-Based Classifiers [en_US]
dc.type: Electronic Thesis or Dissertation [en_US]
thesis.degree.discipline: Department of Computer and Information Science [en_US]
thesis.degree.grantor: University of Oregon [en_US]
thesis.degree.level: masters [en_US]
thesis.degree.name: M.S. [en_US]

Files

Original bundle
Name: Burago_oregon_0171N_11060.pdf
Size: 234.96 KB
Format: Adobe Portable Document Format