A large-scale study of the effect of training set characteristics over learning-to-rank algorithms

  • Authors:
  • Evangelos Kanoulas;Stefan Savev;Pavel Metrikov;Virgil Pavlu;Javed Aslam

  • Affiliations:
  • University of Sheffield, Sheffield, UK;Northeastern University, Boston, MA, USA;Northeastern University, Boston, MA, USA;Northeastern University, Boston, MA, USA;Northeastern University, Boston, MA, USA

  • Venue:
  • Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
  • Year:
  • 2011

Abstract

In this work we describe the results of a large-scale study of how the distribution of labels across the different grades of relevance in the training set affects the performance of trained ranking functions. In a controlled experiment we generate a large number of training datasets with different label distributions and apply three learning-to-rank algorithms to these datasets. We investigate the effect of these distributions on the accuracy of the resulting ranking functions to give insight into how training sets should be constructed.