On the reliability and intuitiveness of aggregated search metrics

  • Authors:
  • Ke Zhou;Mounia Lalmas;Tetsuya Sakai;Ronan Cummins;Joemon M. Jose

  • Affiliations:
  • University of Glasgow, Glasgow, United Kingdom;Yahoo! Labs, Barcelona, Spain;Waseda University, Tokyo, Japan;University of Greenwich, London, United Kingdom;University of Glasgow, Glasgow, United Kingdom

  • Venue:
  • Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Aggregating search results from a variety of diverse verticals such as news, images, videos and Wikipedia into a single interface is a popular web search presentation paradigm. Although several aggregated search (AS) metrics have been proposed to evaluate AS result pages, their properties remain poorly understood. In this paper, we compare the properties of existing AS metrics under the assumptions that (1) queries may have multiple preferred verticals; (2) the likelihood of each vertical preference is available; and (3) the topical relevance assessments of results returned from each vertical is available. We compare a wide range of AS metrics on two test collections. Our main criteria of comparison are (1) discriminative power, which represents the reliability of a metric in comparing the performance of systems, and (2) intuitiveness, which represents how well a metric captures the various key aspects to be measured (i.e. various aspects of a user's perception of AS result pages). Our study shows that the AS metrics that capture key AS components (e.g., vertical selection) have several advantages over other metrics. This work sheds new lights on the further developments and applications of AS metrics.