Detecting adversarial advertisements in the wild

Authors:
D. Sculley;Matthew Eric Otey;Michael Pohl;Bridget Spitznagel;John Hainsworth;Yunkai Zhou
Affiliations:
Google, Inc., Pittsburgh, PA, USA;Google, Inc., Pittsburgh, PA, USA;Google, Inc., Pittsburgh, PA, USA;Google, Inc., Pittsburgh, PA, USA;Google, Inc., Pittsburgh, PA, USA;Google, Inc., Pittsburgh, PA, USA
Venue:
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2011

Abstract

In a large online advertising system, adversaries may attempt to profit from the creation of low-quality or harmful advertisements. In this paper, we present a large-scale data mining effort that detects and blocks such adversarial advertisements for the benefit and safety of our users. Because both false positives and false negatives have high cost, our deployed system uses a tiered strategy combining automated and semi-automated methods to ensure reliable classification. We also employ strategies to address the challenges of learning from highly skewed data at scale, allocating the effort of human experts, leveraging domain expert knowledge, and independently assessing the effectiveness of our system.