Query-level loss functions for information retrieval

  • Authors: Tao Qin; Xu-Dong Zhang; Ming-Feng Tsai; De-Sheng Wang; Tie-Yan Liu; Hang Li

  • Affiliations: Department of Electronic Engineering, Tsinghua University, Beijing, 100084, PR China; Department of Electronic Engineering, Tsinghua University, Beijing, 100084, PR China; Department of Computer Science and Information Engineering, National Taiwan University, Taiwan 106, ROC; Department of Electronic Engineering, Tsinghua University, Beijing, 100084, PR China; Microsoft Research Asia, No. 49 Zhichun Road, Haidian District, Beijing 100080, PR China; Microsoft Research Asia, No. 49 Zhichun Road, Haidian District, Beijing 100080, PR China

  • Venue: Information Processing and Management: an International Journal
  • Year: 2008

Abstract

Many machine learning technologies, such as support vector machines, boosting, and neural networks, have been applied to the ranking problem in information retrieval. However, because these methods were not originally developed for this task, their loss functions are not directly linked to the criteria used to evaluate ranking. Specifically, the loss functions are defined at the level of documents or document pairs, whereas the evaluation criteria are defined at the level of queries. Minimizing these loss functions therefore does not necessarily improve ranking performance. To solve this problem, we propose using query-level loss functions when learning ranking functions. We discuss the basic properties that a query-level loss function should have and propose a query-level loss function based on the cosine similarity between a ranking list and the corresponding ground truth. We further design a coordinate descent algorithm, referred to as RankCosine, which uses the proposed loss function to build a generalized additive ranking model. We also discuss whether the loss functions of existing ranking algorithms can be extended to the query level. Experimental results on datasets from the TREC web track, OHSUMED, and a commercial web search engine show that the proposed query-level loss function significantly improves ranking accuracy. Furthermore, we found that it is difficult to extend document-level loss functions to query-level loss functions.
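
As a rough illustration of the idea, a cosine-based query-level loss can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the formula, so the scaling `0.5 * (1 - cosine)`, the zero-vector handling, and the names `cosine_loss` and `query_level_loss` are hypothetical choices made for this example. Each query contributes one loss value computed over its whole list of documents, so a query with many documents does not outweigh a query with few.

```python
import math


def cosine_loss(scores, labels):
    """Loss for ONE query: 0.5 * (1 - cosine similarity).

    scores: model outputs for the query's documents (assumed input).
    labels: ground-truth relevance values for the same documents.
    The 0.5 scaling keeps the loss in [0, 1]; it is an assumption here.
    """
    dot = sum(s * l for s, l in zip(scores, labels))
    norm_s = math.sqrt(sum(s * s for s in scores))
    norm_l = math.sqrt(sum(l * l for l in labels))
    if norm_s == 0.0 or norm_l == 0.0:
        # Cosine is undefined for a zero vector; treat it as orthogonal.
        return 0.5
    return 0.5 * (1.0 - dot / (norm_s * norm_l))


def query_level_loss(queries):
    """Average the per-query losses so that every query weighs the same.

    queries: list of (scores, labels) pairs, one pair per query.
    """
    return sum(cosine_loss(s, l) for s, l in queries) / len(queries)


if __name__ == "__main__":
    training_queries = [
        ([2.0, 1.0, 0.0], [2, 1, 0]),          # scores agree with labels
        ([0.1, 0.9, 0.3, 0.2], [1, 0, 0, 0]),  # relevant document scored low
    ]
    print(query_level_loss(training_queries))
```

Because cosine similarity ignores the overall scale of the score vector, multiplying all scores for a query by a positive constant leaves this loss unchanged, which matches the fact that such rescaling does not change the ranking order.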