Collection-integral source selection for uncooperative distributed information retrieval environments

  • Authors:
  • Georgios Paltoglou;Michail Salampasis;Maria Satratzemi

  • Affiliations:
  • University of Macedonia, Egnatias 156, P.O. Box 54006, Greece;Alexander Technological Educational Institute of Thessaloniki, P.O. Box 141, 57400 Thessaloniki, Greece;University of Macedonia, Egnatias 156, P.O. Box 54006, Greece

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2010

Quantified Score

Hi-index 0.07

Abstract

We propose a new integral-based source selection algorithm for uncooperative distributed information retrieval environments. The algorithm models each source as a plot, using the relevance scores and the intra-collection positions of its sampled documents with reference to a centralized sample index. Based on this model, the algorithm locates the collections that contain the most relevant documents. A number of transformations are applied to the original plot in order to reward collections that have higher-scoring documents and to dampen the effect of collections returning an excessive number of documents. The family of linear interpolant functions that pass through the points of the modified plot is computed for each available source, and the area that they cover in the rank-relevance space is calculated. Information sources are then ranked by this area. Using this novel metric for collection relevance, the algorithm is tested on a variety of testbeds in both recall-oriented and precision-oriented settings; its performance is found to be better than or at least equal to that of previous state-of-the-art approaches, making it a very effective and robust solution overall.
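
The core idea of ranking sources by the area under a piecewise-linear rank-relevance plot can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the plot transformations, so they are omitted here. The function name `rank_sources`, the input layout (a list of `(collection_id, score)` pairs already sorted by descending score from the centralized sample index), and the use of the trapezoidal rule between consecutive points are all assumptions made for illustration.

```python
from collections import defaultdict


def rank_sources(sample_results):
    """Rank collections by the area under their rank-relevance plot.

    sample_results: hypothetical input, a list of (collection_id, score)
    pairs from a centralized sample index, sorted by descending score.
    """
    # Build one plot per collection: x is the intra-collection position
    # (1, 2, 3, ...) of each sampled document, y is its relevance score.
    points = defaultdict(list)
    for collection_id, score in sample_results:
        position = len(points[collection_id]) + 1
        points[collection_id].append((position, score))

    # Area under the linear interpolant through consecutive points
    # (trapezoidal rule). The paper's transformations are NOT applied.
    areas = {}
    for collection_id, pts in points.items():
        area = 0.0
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            area += (x1 - x0) * (y0 + y1) / 2.0
        areas[collection_id] = area

    # Larger area first.
    return sorted(areas.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    results = [("A", 0.9), ("B", 0.8), ("A", 0.7), ("A", 0.5), ("B", 0.2)]
    for collection_id, area in rank_sources(results):
        print(collection_id, round(area, 3))
```

With the made-up scores above, collection A ranks ahead of collection B because its three sampled documents span a larger area. In this simplified sketch a collection with a single sampled document receives an area of zero; how such cases are treated in the published algorithm is not stated in the abstract.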