In subset ranking, the goal is to learn a ranking function that approximates a gold-standard partial ordering of a set of objects (in our case, a set of documents retrieved for the same query). The partial ordering is given by relevance labels representing the relevance of the documents with respect to the query on an absolute scale. Our approach consists of three simple steps. First, we train standard multi-class classifiers (AdaBoost.MH and multi-class SVM) to discriminate between the relevance labels. Second, the posteriors of the multi-class classifiers are calibrated using probabilistic and regression losses in order to estimate the Bayes-scoring function, which optimizes the Normalized Discounted Cumulative Gain (NDCG). In the third step, instead of selecting the best multi-class hyperparameters and the best calibration, we mix all the learned models in a simple ensemble scheme.

Our extensive experimental study is itself a substantial contribution. We compare most of the existing learning-to-rank techniques on all of the available large-scale benchmark data sets using a standardized implementation of the NDCG score. We show that our approach is competitive with conceptually more complex listwise and pairwise methods, and that it clearly outperforms them as the data size grows. As a technical contribution, we clarify some of the confusing results related to the ambiguities of the evaluation tools, and we propose guidelines for future studies.
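As a rough illustration of the three-step pipeline, here is a minimal, self-contained sketch on synthetic data. It is a sketch under stated assumptions, not the paper's implementation: scikit-learn's AdaBoostClassifier stands in for AdaBoost.MH (no multi-class SVM is trained), the calibration step is reduced to a single expected-relevance-grade scoring rule rather than the family of probabilistic and regression calibrations, and the ensembling over hyperparameters and calibrations is omitted. All data, names, and parameter values below are illustrative.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def dcg_at_k(rel, k=10):
    """DCG with the common 2^rel - 1 gain and log2(rank + 1) discount."""
    rel = np.asarray(rel, dtype=float)[:k]
    return np.sum((2.0 ** rel - 1.0) / np.log2(np.arange(2, rel.size + 2)))

def ndcg_at_k(rel_in_predicted_order, k=10):
    """NDCG: DCG of the predicted ranking over DCG of the ideal ranking."""
    dcg = dcg_at_k(rel_in_predicted_order, k)
    idcg = dcg_at_k(np.sort(rel_in_predicted_order)[::-1], k)
    return dcg / idcg if idcg > 0 else 0.0

# Synthetic query-document data: feature vectors, query ids, graded labels in {0,...,4}.
rng = np.random.default_rng(0)
n_queries, docs_per_query, n_features = 50, 20, 10
X = rng.normal(size=(n_queries * docs_per_query, n_features))
y = np.clip(np.round(2 + X[:, 0] + 0.5 * rng.normal(size=len(X))), 0, 4).astype(int)
qid = np.repeat(np.arange(n_queries), docs_per_query)

# Step 1: multi-class classifier over relevance grades
# (AdaBoostClassifier is only a stand-in for AdaBoost.MH).
clf = AdaBoostClassifier(n_estimators=200, random_state=0)
clf.fit(X, y)

# Step 2: turn class posteriors into a single document score; here simply the
# expected relevance grade under the predicted posterior.
proba = clf.predict_proba(X)        # shape: (n_docs, n_grades)
grades = clf.classes_.astype(float)
scores = proba @ grades

# Step 3 (evaluation): rank documents within each query by score, report mean NDCG@10.
ndcgs = []
for q in np.unique(qid):
    mask = qid == q
    order = np.argsort(-scores[mask])
    ndcgs.append(ndcg_at_k(y[mask][order], k=10))
print("mean NDCG@10:", np.mean(ndcgs))
```

Scoring each document by its expected gain under the class posterior is only one simple instance of the calibration step; the approach described above instead trains several calibrations (with probabilistic and regression losses) and several classifier hyperparameter settings, and mixes the resulting scoring functions in an ensemble rather than selecting a single best one.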