On the hardness of evading combinations of linear classifiers

  • Authors:
  • David Stevens; Daniel Lowd

  • Affiliations:
  • University of Oregon, Eugene, OR, USA; University of Oregon, Eugene, OR, USA

  • Venue:
  • Proceedings of the 2013 ACM Workshop on Artificial Intelligence and Security (AISec '13)
  • Year:
  • 2013

Abstract

An increasing number of machine learning applications involve detecting the malicious behavior of an attacker who wishes to avoid detection. In such domains, attackers modify their behavior to evade the classifier while accomplishing their goals as efficiently as possible. The attackers typically do not know the exact classifier parameters, but they may be able to evade the classifier by observing its behavior on test instances that they construct. For example, spammers may learn the most effective ways to modify their spam messages by sending test emails to accounts they control. This problem setting has been formally analyzed for linear classifiers with discrete features and for convex-inducing classifiers with continuous features, but never for non-linear classifiers with discrete features. In this paper, we extend previous ACRE (adversarial classifier reverse engineering) learning results to convex polytopes representing unions or intersections of linear classifiers. We prove that exponentially many queries are required in the worst case, but that when the features used by the component classifiers are disjoint, previous attacks on linear classifiers can be adapted to attack them efficiently. In experiments, we further analyze the cost and number of queries required to attack different types of classifiers. These results move us closer to a comprehensive understanding of the relative vulnerability of different types of classifiers to malicious adversaries.
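To make the query-based setting concrete, below is a minimal sketch (not the algorithm analyzed in the paper) of a black-box evasion attack against a union ("OR") of linear classifiers defined over disjoint boolean feature sets. The names (`or_ensemble`, `greedy_evade`), the example weights, and the greedy flipping strategy are illustrative assumptions; the attacker is assumed to have only membership-query access to the ensemble, as in the ACRE setting.

```python
# Hypothetical sketch of the disjoint-features setting; not the paper's attack.
from typing import Callable, List, Sequence


def or_ensemble(components: Sequence[dict], x: Sequence[int]) -> bool:
    """Flag x as malicious if ANY component's linear score reaches its threshold."""
    return any(
        sum(w * x[i] for i, w in zip(c["features"], c["weights"])) >= c["threshold"]
        for c in components
    )


def greedy_evade(x: Sequence[int],
                 feature_groups: List[List[int]],
                 query: Callable[[Sequence[int]], bool]) -> List[int]:
    """Turn off attacker-controlled features group by group, re-querying the
    black-box ensemble after each change, until it no longer flags x.
    Because the feature groups are disjoint, each component can be handled
    independently of the others."""
    x = list(x)
    for group in feature_groups:
        for i in group:
            if not query(x):
                return x          # the whole ensemble is already evaded
            if x[i] == 1:
                x[i] = 0          # remove one suspicious feature, then re-query
    return x


if __name__ == "__main__":
    # Two linear components over disjoint feature subsets (illustrative weights).
    components = [
        {"features": [0, 1, 2], "weights": [1.0, 2.0, 1.5], "threshold": 2.0},
        {"features": [3, 4, 5], "weights": [2.5, 0.5, 1.0], "threshold": 2.0},
    ]
    spam = [1, 1, 0, 1, 0, 1]                     # initially flagged by both parts
    query = lambda x: or_ensemble(components, x)  # the attacker's only access
    evaded = greedy_evade(spam, [c["features"] for c in components], query)
    print(evaded, "flagged:", query(evaded))
```

Because the component feature sets are disjoint, each group can be attacked in isolation; a real attack, such as the cost-sensitive linear attacks the paper adapts, would also minimize the attacker's modification cost, which this naive sketch does not.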