Disproving the fusion hypothesis: an analysis of data fusion via effective information retrieval strategies

  • Authors:
  • Steven M. Beitzel;Ophir Frieder;Eric C. Jensen;David Grossman;Abdur Chowdhury;Nazli Goharian

  • Affiliations:
  • Illinois Institute of Technology, Chicago, IL (all six authors)

  • Venue:
  • Proceedings of the 2003 ACM symposium on Applied computing
  • Year:
  • 2003

Abstract

Many prior efforts have been devoted to the idea that data fusion techniques can improve retrieval effectiveness. Recent work in the area suggests that many approaches, particularly multiple-evidence combinations, can successfully improve the effectiveness of a system. Unfortunately, the conditions under which these improvements occur have not been made clear. We examine popular data fusion techniques designed to improve effectiveness and clarify the conditions required for data fusion to show improvement. We demonstrate that for fusion to improve effectiveness, the result sets being fused must contain a significant number of unique relevant documents. Furthermore, we show that for this improvement to be visible, these unique relevant documents must be highly ranked. In addition, we present a comprehensive discussion of why previous assumptions about the effectiveness of multiple-evidence techniques are misleading. Detailed empirical results and analysis are provided to support our conclusions.
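
The two conditions stated in the abstract can be made concrete with a small illustration. The sketch below is not taken from the paper: it assumes CombMNZ (a widely used score-combination method) as a stand-in for the "popular data fusion techniques" mentioned, and the function names `comb_mnz` and `unique_relevant`, the run format (document id mapped to score), and the sample data are all hypothetical.

```python
from collections import defaultdict


def comb_mnz(runs):
    """Fuse runs with CombMNZ (assumed technique, not confirmed by the source).

    Each run maps a document id to a retrieval score. Scores are min-max
    normalized per run, summed per document, and multiplied by the number
    of runs that retrieved the document.
    """
    total = defaultdict(float)
    hits = defaultdict(int)
    for run in runs:
        if not run:
            continue
        lo, hi = min(run.values()), max(run.values())
        span = (hi - lo) or 1.0  # avoid division by zero when all scores tie
        for doc, score in run.items():
            total[doc] += (score - lo) / span
            hits[doc] += 1
    fused = {doc: total[doc] * hits[doc] for doc in total}
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))


def unique_relevant(runs, relevant):
    """For each run, return the relevant documents that no other run retrieved."""
    result = []
    for i, run in enumerate(runs):
        others = set()
        for j, other in enumerate(runs):
            if j != i:
                others.update(other)
        result.append((set(run) & relevant) - others)
    return result


# Hypothetical sample data: two runs and a set of relevant documents.
run_a = {"d1": 3.0, "d2": 2.0, "d3": 1.0}
run_b = {"d2": 9.0, "d4": 6.0, "d1": 3.0}
relevant = {"d1", "d4"}

print(comb_mnz([run_a, run_b]))
print(unique_relevant([run_a, run_b], relevant))
```

In this toy data only `run_b` contributes a relevant document the other run lacks (`d4`), and the fused ranking still places it below `d2`, which both runs retrieved. This is meant only to show how the overlap between result sets and the rank of unique relevant documents can be inspected, not to reproduce the paper's experiments.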