Increasing evaluation sensitivity to diversity

  • Authors:
  • Peter B. Golbus, College of Computer and Information Science, Northeastern University, Boston, USA 02115
  • Javed A. Aslam, College of Computer and Information Science, Northeastern University, Boston, USA 02115
  • Charles L. Clarke, School of Computer Science, University of Waterloo, Waterloo, Canada N2L 3G1

  • Venue:
  • Information Retrieval
  • Year:
  • 2013

Abstract

Many queries have multiple interpretations; they are ambiguous or underspecified. This is especially true in the context of Web search. To account for this, much recent research has focused on creating systems that produce diverse ranked lists. In order to validate these systems, several new evaluation measures have been created to quantify diversity. Ideally, diversity evaluation measures would distinguish between systems by the amount of diversity in the ranked lists they produce. Unfortunately, diversity is also a function of the collection over which the system is run and a system's performance at ad-hoc retrieval. A ranked list built from a collection that does not cover multiple subtopics cannot be diversified; neither can a ranked list that contains no relevant documents. To ensure that we are assessing systems by their diversity, we develop (1) a family of evaluation measures that take into account the diversity of the collection and (2) a meta-evaluation measure that explicitly controls for performance. We demonstrate experimentally that our new measures can achieve substantial improvements in sensitivity to diversity without reducing discriminative power.