A robust ranking methodology based on diverse calibration of AdaBoost

  • Authors:
  • Róbert Busa-Fekete;Balázs Kégl;Tamás Éltető;György Szarvas

  • Affiliations:
  • Linear Accelerator Laboratory, University of Paris-Sud, CNRS, Orsay, France and Research Group on Artificial Intelligence of the Hungarian Academy of Sciences and University of Szeged, Szeged, Hungary;Linear Accelerator Laboratory, University of Paris-Sud, CNRS and Computer Science Laboratory, University of Paris-Sud, CNRS and INRIA-Saclay, Orsay, France;Computer Science Laboratory, University of Paris-Sud, CNRS and INRIA-Saclay, Orsay, France;Research Group on Artificial Intelligence of the Hungarian Academy of Sciences and University of Szeged, Szeged, Hungary and Computer Science Department, Technische Universität Darmstadt, Darmstadt, Germany

  • Venue:
  • ECML PKDD'11: Proceedings of the 2011 European Conference on Machine Learning and Knowledge Discovery in Databases, Part I
  • Year:
  • 2011

Abstract

In subset ranking, the goal is to learn a ranking function that approximates a gold-standard partial ordering of a set of objects (in our case, the relevance labels of a set of documents retrieved for the same query). In this paper we introduce a learning-to-rank approach to subset ranking based on multi-class classification. Our technique can be summarized in three major steps. First, a multi-class classification model (AdaBoost.MH) is trained to predict the relevance label of each object. Second, the trained model is calibrated using several calibration techniques to obtain diverse class-probability estimates. Finally, the Bayes-scoring function (which optimizes the popular information retrieval performance measure NDCG) is approximated by mixing these estimates into a final scoring function. An important novelty of our approach is that many different methods are applied to estimate the same probability distribution, and all of these hypotheses are combined into an improved model. It is well known that mixing different conditional distributions according to a prior is usually more efficient than selecting a single "optimal" distribution. Accordingly, by using all the calibration techniques, our approach does not need to identify the best-suited calibration method and is therefore less prone to overfitting. In an experimental study on the LETOR benchmark datasets, our method outperformed many standard ranking algorithms, most of which are significantly more complex learning-to-rank algorithms than ours.
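
To illustrate the final step, the sketch below shows how a mixed Bayes-scoring function could be computed from a set of calibrated class-probability estimates. This is a minimal sketch rather than the authors' implementation: the function name mixed_bayes_scores, the uniform mixing weights over calibration methods, and the exponential NDCG gain 2^k - 1 are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def mixed_bayes_scores(prob_estimates, n_classes):
    """Approximate the Bayes-scoring function by mixing diverse
    calibrated class-probability estimates.

    prob_estimates : list of (n_docs, n_classes) arrays, one per
                     calibration method; each row is a distribution
                     over relevance labels 0..n_classes-1.
    """
    # Mix the calibrated distributions; a uniform prior over the
    # calibration methods is assumed here.
    p_mixed = np.mean(np.stack(prob_estimates), axis=0)

    # Standard exponential NDCG gain: gain(k) = 2^k - 1.
    gains = 2.0 ** np.arange(n_classes) - 1.0

    # The expected gain per document under the mixed distribution
    # serves as the (approximate) Bayes score.
    return p_mixed @ gains

# Hypothetical usage: three calibration methods on 5 documents with
# relevance labels in {0, 1, 2}; rank documents by descending score.
rng = np.random.default_rng(0)
estimates = []
for _ in range(3):
    raw = rng.random((5, 3))
    estimates.append(raw / raw.sum(axis=1, keepdims=True))

scores = mixed_bayes_scores(estimates, n_classes=3)
ranking = np.argsort(-scores)  # best document first
print(ranking)
```

Sorting the documents of a query by descending score then approximates the NDCG-optimal ordering under the mixed distribution.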