Adaptive query-based sampling of distributed collections

  • Authors:
  • Mark Baillie; Leif Azzopardi; Fabio Crestani

  • Affiliations:
  • Department of Computing and Information Sciences, University of Strathclyde, Glasgow, UK; Department of Computing and Information Sciences, University of Strathclyde, Glasgow, UK; Department of Computing and Information Sciences, University of Strathclyde, Glasgow, UK

  • Venue:
  • SPIRE'06: Proceedings of the 13th International Conference on String Processing and Information Retrieval
  • Year:
  • 2006

Abstract

As part of a Distributed Information Retrieval system, a description of each remote information resource, archive or repository is usually stored centrally in order to facilitate resource selection. The acquisition of precise resource descriptions is therefore an important phase in Distributed Information Retrieval, as the quality of such representations affects selection accuracy and, ultimately, retrieval performance. While Query-Based Sampling is currently used for content discovery of uncooperative resources, the application of this technique depends on heuristic guidelines to determine when a sufficiently accurate representation of each remote resource has been obtained. In this paper we address this shortcoming by using the Predictive Likelihood to indicate both the quality of an acquired resource description estimate and the point during Query-Based Sampling at which a sufficiently good representation of a resource has been obtained.
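
The abstract does not spell out the procedure, so the following is only a minimal sketch of how a likelihood-based stopping rule could replace a fixed document budget in Query-Based Sampling. It is not the authors' implementation. Every name here is hypothetical: `search` stands in for the remote resource's query interface, `held_out_terms` for whatever reference text the likelihood is measured against, and the additive smoothing, the deterministic choice of the next query term, and the tolerance `tol` are all assumptions made for illustration.

```python
import math
from collections import Counter


def predictive_log_likelihood(sample_counts, held_out_terms, vocab_size, mu=1.0):
    """Average log-probability of the held-out terms under a unigram model
    estimated from the sampled documents (additive smoothing, an assumption)."""
    total = sum(sample_counts.values())
    denom = total + mu * vocab_size
    ll = sum(math.log((sample_counts.get(t, 0) + mu) / denom) for t in held_out_terms)
    return ll / len(held_out_terms)


def adaptive_sample(search, seed_query, held_out_terms, vocab_size,
                    batch=4, tol=1e-3, max_docs=500):
    """Sample a resource by querying it, stopping once the predictive
    log-likelihood of the description changes by less than `tol`.

    `search(query, k)` is a hypothetical interface returning up to k
    (doc_id, text) pairs from the remote resource.
    """
    counts = Counter()   # the resource description estimate (term counts)
    seen = set()         # ids of documents already sampled
    tried = set()        # query terms already issued
    query, prev, score = seed_query, None, None

    while len(seen) < max_docs:
        tried.add(query)
        new_docs = [(d, text) for d, text in search(query, batch) if d not in seen]
        for doc_id, text in new_docs:
            seen.add(doc_id)
            counts.update(text.lower().split())

        # Only test for convergence when the estimate actually changed.
        if new_docs:
            score = predictive_log_likelihood(counts, held_out_terms, vocab_size)
            if prev is not None and abs(score - prev) < tol:
                break
            prev = score

        # Next query: the most frequent sampled term not yet issued.
        # (Query-Based Sampling usually picks a term at random; a fixed
        # choice is used here only to keep the sketch reproducible.)
        candidates = [t for t, _ in counts.most_common() if t not in tried]
        if not candidates:
            break
        query = candidates[0]

    return counts, len(seen), score
```

Under these assumptions, the number of documents sampled varies per resource: a small, homogeneous collection would be expected to converge after a few queries, while a larger or more varied one keeps being sampled until the estimate stabilises or `max_docs` is reached.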