In subset ranking, the goal is to learn a ranking function that approximates a gold-standard partial ordering of a set of objects (in our case, a set of documents retrieved for the same query). The partial ordering is given by relevance labels representing the relevance of the documents with respect to the query on an absolute scale. Our approach consists of three simple steps. First, we train standard multi-class classifiers (AdaBoost.MH and multi-class SVM) to discriminate between the relevance labels. Second, the posteriors of the multi-class classifiers are calibrated using probabilistic and regression losses in order to estimate the Bayes-scoring function, which optimizes the Normalized Discounted Cumulative Gain (NDCG). In the third step, instead of selecting the best multi-class hyperparameters and the best calibration, we mix all the learned models in a simple ensemble scheme.

Our extensive experimental study is itself a substantial contribution. We compare most of the existing learning-to-rank techniques on all of the available large-scale benchmark data sets using a standardized implementation of the NDCG score. We show that our approach is competitive with conceptually more complex listwise and pairwise methods, and that it clearly outperforms them as the data size grows. As a technical contribution, we clarify some of the confusing results related to the ambiguities of the evaluation tools, and we propose guidelines for future studies.
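As a rough illustration of the three-step pipeline, here is a minimal, self-contained sketch on synthetic data. It is a sketch under stated assumptions, not the paper's implementation: scikit-learn's AdaBoostClassifier stands in for AdaBoost.MH (no multi-class SVM is trained), the calibration step is reduced to a single expected-relevance-grade scoring rule rather than the family of probabilistic and regression calibrations, and the ensembling over hyperparameters and calibrations is omitted. All data, names, and parameter values below are illustrative.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def dcg_at_k(rel, k=10):
    """DCG with the common 2^rel - 1 gain and log2(rank + 1) discount."""
    rel = np.asarray(rel, dtype=float)[:k]
    return np.sum((2.0 ** rel - 1.0) / np.log2(np.arange(2, rel.size + 2)))

def ndcg_at_k(rel_in_predicted_order, k=10):
    """NDCG: DCG of the predicted ranking over DCG of the ideal ranking."""
    dcg = dcg_at_k(rel_in_predicted_order, k)
    idcg = dcg_at_k(np.sort(rel_in_predicted_order)[::-1], k)
    return dcg / idcg if idcg > 0 else 0.0

# Synthetic query-document data: feature vectors, query ids, graded labels in {0,...,4}.
rng = np.random.default_rng(0)
n_queries, docs_per_query, n_features = 50, 20, 10
X = rng.normal(size=(n_queries * docs_per_query, n_features))
y = np.clip(np.round(2 + X[:, 0] + 0.5 * rng.normal(size=len(X))), 0, 4).astype(int)
qid = np.repeat(np.arange(n_queries), docs_per_query)

# Step 1: multi-class classifier over relevance grades
# (AdaBoostClassifier is only a stand-in for AdaBoost.MH).
clf = AdaBoostClassifier(n_estimators=200, random_state=0)
clf.fit(X, y)

# Step 2: turn class posteriors into a single document score; here simply the
# expected relevance grade under the predicted posterior.
proba = clf.predict_proba(X)        # shape: (n_docs, n_grades)
grades = clf.classes_.astype(float)
scores = proba @ grades

# Step 3 (evaluation): rank documents within each query by score, report mean NDCG@10.
ndcgs = []
for q in np.unique(qid):
    mask = qid == q
    order = np.argsort(-scores[mask])
    ndcgs.append(ndcg_at_k(y[mask][order], k=10))
print("mean NDCG@10:", np.mean(ndcgs))
```

Scoring each document by its expected gain under the class posterior is only one simple instance of the calibration step; the approach described above instead trains several calibrations (with probabilistic and regression losses) and several classifier hyperparameter settings, and mixes the resulting scoring functions in an ensemble rather than selecting a single best one.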