LETOR: A benchmark collection for research on learning to rank for information retrieval

  • Authors:
  • Tao Qin; Tie-Yan Liu; Jun Xu; Hang Li

  • Affiliations:
  • Microsoft Research Asia, Beijing, China (all authors)

  • Venue:
  • Information Retrieval

  • Year:
  • 2010

Abstract

LETOR is a benchmark collection for research on learning to rank for information retrieval, released by Microsoft Research Asia. In this paper, we describe the details of the LETOR collection and show how it can be used in different kinds of research. Specifically, we describe how the document corpora and query sets in LETOR are selected, how the documents are sampled, how the learning features and meta information are extracted, and how the datasets are partitioned for comprehensive evaluation. We then compare several state-of-the-art learning to rank algorithms on LETOR, report their ranking performance, and discuss the results. After that, we discuss possible new research topics that can be supported by LETOR, beyond algorithm comparison. We hope that this paper helps people gain a deeper understanding of LETOR and enables more interesting research projects on learning to rank and related topics.
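
As a rough illustration of how such a collection is typically consumed in experiments, the sketch below parses LETOR-style feature files, where each line carries a relevance label, a qid: field, numbered feature:value pairs, and an optional trailing comment introduced by "#". This is a minimal sketch under those format assumptions; the helper names (parse_letor_line, load_letor_file) are illustrative and not defined by the paper.

    from collections import defaultdict

    def parse_letor_line(line):
        """Parse one LETOR-style line: '<label> qid:<qid> <fid>:<val> ... # <comment>'."""
        # Split off the trailing comment (e.g. document id), if any.
        data, _, comment = line.partition("#")
        tokens = data.split()
        label = int(tokens[0])                # graded relevance judgment
        qid = tokens[1].split(":", 1)[1]      # query identifier
        features = {}
        for tok in tokens[2:]:
            fid, val = tok.split(":", 1)
            features[int(fid)] = float(val)
        return qid, label, features, comment.strip()

    def load_letor_file(path):
        """Group (label, feature-vector) pairs by query id."""
        queries = defaultdict(list)
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                qid, label, features, _ = parse_letor_line(line)
                queries[qid].append((label, features))
        return queries

Grouping documents by query id, as above, is the natural unit for training and evaluating ranking models, since both the loss functions and the evaluation measures (e.g. NDCG, MAP) are computed per query.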