Evaluating large-scale distributed vertical search

  • Authors:
  • Ke Zhou (University of Glasgow, Glasgow, United Kingdom); Ronan Cummins (University of Glasgow, Glasgow, United Kingdom); Mounia Lalmas (Yahoo! Research, Barcelona, Spain); Joemon Jose (University of Glasgow, Glasgow, United Kingdom)

  • Venue:
  • Proceedings of the 9th workshop on Large-scale and distributed information retrieval
  • Year:
  • 2011

Abstract

Aggregating search results from a variety of distributed heterogeneous sources, so-called verticals, such as news, image, video and blog, into a single interface has become a popular paradigm in large-scale web search. As various distributed vertical search techniques (also known as aggregated search) have been proposed, it is crucial to be able to properly evaluate these systems on a large-scale standard test set. A test collection for aggregated search requires a number of verticals, each populated by items (e.g. documents, images) of that vertical type, a set of topics expressing information needs relating to one or more verticals, and relevance assessments indicating the relevance of the items and their associated verticals to each topic. Building a large-scale test collection for aggregated search is costly in terms of time and resources. In this paper, we propose a methodology for building such a test collection by reusing existing test collections, which allows the investigation of aggregated search approaches. We report on experiments, based on twelve simulated aggregated search systems, that show the impact of misclassifying items into verticals on the evaluation of these systems.
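To illustrate the kind of effect the experiments measure, the following is a minimal toy sketch (not the authors' methodology or code) of how misclassifying items into the wrong vertical can distort an evaluation score. The metric, item labels, and error model here are all simplifying assumptions: each item carries a single vertical label, a topic is relevant to a set of verticals, and a run is scored by the fraction of returned items whose label falls in a relevant vertical.

```python
import random

# Hypothetical vertical names, echoing the examples in the abstract.
VERTICALS = ["news", "image", "video", "blog"]

def vertical_precision(ranked_items, relevant_verticals):
    """Toy metric (an assumption, not from the paper): fraction of
    returned items whose vertical label is relevant to the topic."""
    hits = sum(1 for v in ranked_items if v in relevant_verticals)
    return hits / len(ranked_items)

def misclassify(ranked_items, error_rate, rng):
    """Simulate classifier noise: with probability `error_rate`,
    replace an item's vertical label with a different random one."""
    noisy = []
    for v in ranked_items:
        if rng.random() < error_rate:
            noisy.append(rng.choice([u for u in VERTICALS if u != v]))
        else:
            noisy.append(v)
    return noisy

rng = random.Random(0)  # fixed seed so the simulation is repeatable

# An ideal run: every returned item truly belongs to the one relevant vertical.
truth = ["news"] * 10
print(vertical_precision(truth, {"news"}))  # 1.0 by construction

# The same run scored after labels pass through a noisy classifier:
# the measured score understates the system's true quality.
noisy = misclassify(truth, error_rate=0.3, rng=rng)
print(vertical_precision(noisy, {"news"}))
```

Averaged over many trials, the noisy score drifts toward (1 - error_rate) for this run, which is the sense in which item misclassification systematically biases system comparisons.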