Result merging methods in distributed information retrieval with overlapping databases

  • Authors:
  • Shengli Wu; Sally McClean

  • Affiliations:
  • School of Computing and Mathematics, University of Ulster, Northern Ireland, UK;School of Computing and Mathematics, University of Ulster, Northern Ireland, UK

  • Venue:
  • Information Retrieval
  • Year:
  • 2007

Abstract

In distributed information retrieval systems, documents frequently overlap among different component databases. This paper presents an experimental investigation and evaluation of a group of result merging methods, including the shadow document method and the multi-evidence method, in an environment of overlapping databases. We assume that, apart from the resultant document lists (with either rankings or scores), no extra information about retrieval servers and text databases is available, which is the usual case for many applications on the Internet and the Web. The experimental results show that the shadow document method and the multi-evidence method are the two best methods when overlap is high, while Round-robin is the best when overlap is low. The experiments also show that [0,1] linear normalization is a better option than linear regression normalization for result merging in a heterogeneous environment.
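To make two of the techniques named in the abstract concrete, the following is a minimal Python sketch of [0,1] linear normalization and Round-robin merging. It is an illustration under stated assumptions, not the authors' implementation: the function names `normalize_01` and `round_robin`, the `(doc_id, score)` input format, the handling of lists whose scores are all equal, and the rule of keeping only the first occurrence of a duplicated document are all hypothetical choices. The shadow document and multi-evidence methods are not sketched, because the abstract does not describe how they work.

```python
from itertools import zip_longest


def normalize_01(results):
    """Linearly map raw scores onto [0, 1] using (s - min) / (max - min).

    `results` is a list of (doc_id, score) pairs from one database.
    """
    if not results:
        return []
    scores = [score for _, score in results]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption: when all scores are equal, give every document 1.0.
        return [(doc, 1.0) for doc, _ in results]
    return [(doc, (score - lo) / (hi - lo)) for doc, score in results]


def round_robin(result_lists):
    """Interleave ranked lists: take rank 1 from each list, then rank 2, ...

    Assumption: a document returned by several overlapping databases is
    kept only at its first occurrence in the interleaved order.
    """
    merged, seen = [], set()
    for tier in zip_longest(*result_lists):
        for item in tier:
            if item is None:  # this list is shorter than the others
                continue
            doc = item[0]
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged
```

For example, given the lists `[("d1", 12.0), ("d2", 9.0), ("d3", 6.0)]` and `[("d2", 0.8), ("d4", 0.5)]`, `normalize_01` puts both score ranges on the same [0, 1] scale, while `round_robin` ignores scores entirely and uses only rank positions, which is why it needs no normalization step.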