Learning to rank relevant and novel documents through user feedback

  • Authors:
  • Abhimanyu Lad;Yiming Yang

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA

  • Venue:
  • CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

We consider the problem of learning to rank relevant and novel documents so as to directly maximize a performance metric called Expected Global Utility (EGU), which has several desirable properties: (i) It measures retrieval performance in terms of relevant as well as novel information, (ii) gives more importance to top ranks to reflect common browsing behavior of users, as opposed to existing objective functions based on set-coverage, (iii) accommodates different levels of tolerance towards redundancy, which is not taken into account by existing evaluation measures, and (iv) extends naturally to the evaluation of session-based retrieval comprising multiple ranked lists. Our ground truth is defined in terms of "information nuggets", which are obviously not known to the retrieval system when processing a new user query. Therefore, our approach uses observable query and document features (words and named entities) as surrogates for nuggets, whose weights are learned based on user feedback in an iterative search session. The ranked list is produced to maximize the weighted coverage of these surrogate nuggets. The optimization of such coverage-based metrics is known to be NP-hard. Therefore, we use a greedy algorithm and show that it guarantees good performance due to the submodularity of the objective function. Our experiments on Topic Detection and Tracking data show that the proposed approach represents an efficient and effective retrieval strategy for maximizing EGU, as compared to a purely-relevance based ranking approach that uses Indri, as well as a MMR-based approach for non-redundant ranking.