A Set-Covering-Based Approach for Overlapping Resource Selection in Distributed Information Retrieval

  • Authors:
  • Xiuhong Wang; Shiguang Ju

  • Venue:
  • CSIE '09 Proceedings of the 2009 WRI World Congress on Computer Science and Information Engineering - Volume 04
  • Year:
  • 2009

Abstract

Resource selection, also called server selection, collection selection, or database selection, is a foundational problem in distributed information retrieval (DIR). This paper introduces a set-covering-based algorithm for resource selection in DIR that takes into account the extent of overlap between resources. Each document is assigned a weight according to its position in the merged results for a query Q. When later resources are selected, only results that have not already appeared in an earlier selected resource are considered. The score of each resource is the total weight of the merged results it contains, and in each selection step only the resource with the maximum score is selected. The selection order is therefore the actual rank of the selected resources, which are then used to search a query Q' that is similar to Q. By accounting for the overlap between databases, the approach saves considerable search time and, at the same time, improves recall and precision.
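
The greedy selection loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the weighting function or the stopping rule, so the reciprocal-rank weight, the alphabetical tie-break, the optional `max_resources` cap, and the name `select_resources` are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Set


def select_resources(
    merged_ranking: List[str],
    resources: Dict[str, Set[str]],
    max_resources: Optional[int] = None,
) -> List[str]:
    """Greedy weighted set cover over the merged results for a query Q.

    merged_ranking: document ids in merged-result order (best first).
    resources: maps a resource name to the set of document ids it holds.
    Returns resource names in selection order, which is also their rank.
    """
    # ASSUMPTION: reciprocal-rank weights. The abstract only says the weight
    # depends on the document's position in the merged results.
    weights: Dict[str, float] = {}
    for pos, doc in enumerate(merged_ranking):
        weights.setdefault(doc, 1.0 / (pos + 1))

    uncovered = set(weights)      # merged results not yet covered
    remaining = dict(resources)   # resources not yet selected
    selected: List[str] = []

    while uncovered and remaining:
        # ASSUMPTION: optional cap on the number of selected resources.
        if max_resources is not None and len(selected) >= max_resources:
            break
        # Score each resource by the weight of uncovered results it contains.
        scores = {
            name: sum(weights[d] for d in docs & uncovered)
            for name, docs in remaining.items()
        }
        # Pick the maximum score; ties go to the alphabetically first name.
        best = max(sorted(scores), key=scores.get)
        if scores[best] == 0:
            break  # no remaining resource adds an uncovered result
        selected.append(best)
        uncovered -= remaining.pop(best)

    return selected


merged = ["d1", "d2", "d3", "d4", "d5"]
dbs = {
    "A": {"d1", "d2", "d3"},
    "B": {"d2", "d3", "d4"},
    "C": {"d4", "d5"},
    "D": {"d1"},
}
print(select_resources(merged, dbs))  # ['A', 'C']
```

In this toy input, resource B overlaps heavily with A, so after A is selected B contributes only d4 and is outscored by C. B and D are never selected, which is the search-time saving the abstract attributes to handling overlap.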