A large-scale study of the effect of training set characteristics over learning-to-rank algorithms

Authors:
Evangelos Kanoulas;Stefan Savev;Pavel Metrikov;Virgil Pavlu;Javed Aslam
Affiliations:
University of Sheffield, Sheffield, MA, USA;Northeastern University, Boston, MA, USA;Northeastern University, Boston, MA, USA;Northeastern University, Boston, MA, USA;Northeastern University, Boston, MA, USA
Venue:
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Year:
2011

Citing 3
Cited 0

An efficient boosting algorithm for combining preferences

The Journal of Machine Learning Research
Training linear SVMs in linear time

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Document selection methodologies for efficient and effective learning-to-rank

Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval

Quantified Score

Hi-index	0.00

Visualization

Abstract

In this work we describe the results of a large-scale study on the effect of the distribution of labels across the different grades of relevance in the training set on the performance of trained ranking functions. In a controlled experiment we generate a large number of training datasets wih different label distributions and employ three learning to rank algo- rithms over these datasets. We investigate the effect of these distributions on the accuracy of obtained ranking functions to give an insight into the manner training sets should be constructed.