Multi-label learning with millions of labels: recommending advertiser bid phrases for web pages

Authors:
Rahul Agrawal;Archit Gupta;Yashoteja Prabhu;Manik Varma
Affiliations:
Microsoft AdCenter, Bangalore, India;Indian Institute of Technology Delhi, New Delhi, India;Microsoft Research, Bangalore, India;Microsoft Research, Bangalore, India
Venue:
Proceedings of the 22nd international conference on World Wide Web
Year:
2013

Abstract

Recommending phrases from web pages for advertisers to bid on against search engine queries is an important research problem with direct commercial impact. Most approaches have found it infeasible to determine the relevance of all possible queries to a given ad landing page and have focused on making recommendations from a small set of phrases extracted (and expanded) from the page using NLP and ranking-based techniques. In this paper, we eschew this paradigm and demonstrate that it is possible to efficiently predict the relevant subset of queries from a large set of monetizable ones by posing the problem as a multi-label learning task with each query being represented by a separate label. We develop Multi-label Random Forests to tackle problems with millions of labels. Our proposed classifier has prediction costs that are logarithmic in the number of labels and can make predictions in a few milliseconds using 10 GB of RAM. We demonstrate that it is possible to generate training data for our classifier automatically from click logs without any human annotation or intervention. We train our classifier on tens of millions of labels, features and training points in less than two days on a thousand-node cluster. We develop a sparse semi-supervised multi-label learning formulation to deal with training set biases and noisy labels harvested automatically from the click logs. This formulation is used to infer a belief in the state of each label for each training ad, and the random forest classifier is extended to train on these beliefs rather than the given labels. Experiments reveal significant gains over ranking and NLP-based techniques on a large test set of 5 million ads using multiple metrics.
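
The claim that prediction cost need not grow with the number of labels can be illustrated with a toy tree ensemble. The following is a minimal sketch, not the paper's implementation: it assumes each internal node tests a single sparse feature against a threshold and each leaf stores scores for only a handful of labels. The names `Node`, `route_to_leaf` and `predict_top_k`, the example features and the example queries are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    # Internal nodes test one feature against a threshold (an assumption of
    # this sketch); leaves hold scores for a small set of labels.
    feature: Optional[str] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label_scores: Dict[str, float] = field(default_factory=dict)

    def is_leaf(self) -> bool:
        return self.feature is None


def route_to_leaf(tree: Node, x: Dict[str, float]) -> Node:
    """Follow one root-to-leaf path; the cost is the tree depth, not the label count."""
    node = tree
    while not node.is_leaf():
        value = x.get(node.feature, 0.0)  # sparse input: a missing feature counts as 0
        node = node.left if value <= node.threshold else node.right
    return node


def predict_top_k(forest: List[Node], x: Dict[str, float], k: int) -> List[str]:
    """Average the leaf label scores over all trees and return the k best labels."""
    totals: Counter = Counter()
    for tree in forest:
        leaf = route_to_leaf(tree, x)
        for label, score in leaf.label_scores.items():
            totals[label] += score / len(forest)
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [label for label, _ in ranked[:k]]


# Hypothetical two-tree forest over bag-of-words features of a landing page.
tree_a = Node(
    feature="shoes", threshold=0.5,
    left=Node(label_scores={"car insurance": 0.8, "cheap flights": 0.2}),
    right=Node(label_scores={"running shoes": 0.75, "buy sneakers": 0.25}),
)
tree_b = Node(
    feature="running", threshold=0.5,
    left=Node(label_scores={"buy sneakers": 0.5, "car insurance": 0.5}),
    right=Node(label_scores={"running shoes": 0.5, "marathon training": 0.5}),
)
page = {"shoes": 1.0, "running": 1.0}
print(predict_top_k([tree_a, tree_b], page, k=2))
# prints ['running shoes', 'marathon training']
```

In this sketch, each prediction touches one path per tree plus the few labels stored at the reached leaves, so if the trees are roughly balanced the work scales with tree depth rather than with the size of the full label set.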