Tree ensembles for learning to rank
Gradient-boosted regression trees (GBRTs) have proven to be an effective solution to the learning-to-rank problem. This work proposes and evaluates techniques for training GBRTs with efficient runtime characteristics. Our approach is based on the simple observation that compact, shallow, and balanced trees yield faster predictions; it therefore makes sense to incorporate some notion of execution cost during training to "encourage" trees with these topological characteristics. We propose two strategies for accomplishing this: the first directly modifies the node splitting criterion during tree induction, and the second applies stagewise tree pruning. Experiments on a standard learning-to-rank dataset show that the pruning approach is superior: one setting that balances effectiveness and efficiency yields an approximately 40% decrease in prediction latency with minimal reduction in output quality as measured by NDCG.
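To make the two strategies concrete, the sketch below illustrates both ideas in Python. It is a minimal illustration under stated assumptions, not the implementation evaluated in this work: the linear depth penalty, the weight `lam`, and all function and class names (`cost_aware_gain`, `prune_to_budget`, and so on) are hypothetical choices for exposition.

```python
import numpy as np
from dataclasses import dataclass
from typing import Optional

# --- Strategy 1: cost-aware node splitting (illustrative) ----------------
# The usual variance-reduction gain is discounted by a penalty that grows
# with node depth, so deep (latency-increasing) splits must "earn" more
# gain. The linear penalty and the weight `lam` are assumptions, not the
# paper's exact criterion.

def variance_reduction(y, left_mask):
    """Squared-error impurity gain for a candidate split."""
    y_left, y_right = y[left_mask], y[~left_mask]
    if len(y_left) == 0 or len(y_right) == 0:
        return -np.inf
    return (len(y) * np.var(y)
            - len(y_left) * np.var(y_left)
            - len(y_right) * np.var(y_right))

def cost_aware_gain(y, left_mask, depth, lam=0.1):
    """Discount the impurity gain by a depth-based execution-cost penalty."""
    return variance_reduction(y, left_mask) - lam * depth

# --- Strategy 2: stagewise pruning (illustrative) -------------------------
# After each boosting stage, greedily collapse the internal node whose
# removal increases training error the least (a "weakest link") until the
# tree fits a leaf budget, yielding a more compact, shallower tree.

@dataclass
class Node:
    sse: float                     # training SSE if this node were a leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def subtree_sse(n: Node) -> float:
    if n.left is None:
        return n.sse
    return subtree_sse(n.left) + subtree_sse(n.right)

def num_leaves(n: Node) -> int:
    return 1 if n.left is None else num_leaves(n.left) + num_leaves(n.right)

def weakest_link(root: Node) -> Optional[Node]:
    """Internal node whose collapse raises training SSE the least."""
    best, best_cost = None, np.inf
    stack = [root]
    while stack:
        node = stack.pop()
        if node.left is None:      # leaf: nothing to collapse here
            continue
        cost = node.sse - subtree_sse(node)
        if cost < best_cost:
            best, best_cost = node, cost
        stack.extend([node.left, node.right])
    return best

def prune_to_budget(root: Node, max_leaves: int) -> Node:
    """Collapse weakest links until the tree meets a leaf budget."""
    while num_leaves(root) > max_leaves:
        node = weakest_link(root)
        if node is None:           # root is already a leaf
            break
        node.left = node.right = None
    return root
```

In a boosting loop, a pruning routine like `prune_to_budget` would run once per stage, immediately after that stage's tree is fit, which is the stagewise flavor described above; the cost-aware gain would instead be consulted at every candidate split during induction.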