Linear combination of component results in information retrieval

  • Authors:
  • Shengli Wu

  • Affiliations:
  • -

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

In information retrieval, data fusion (also known as meta-search) has been investigated by many researchers. Previous investigation and experimentation demonstrate that the linear combination method is an effective data fusion method for combining multiple information retrieval results. One advantage is its flexibility, since different weights can be assigned to different component systems so as to obtain better fusion results. The key issue is how to assign good weights to all the component retrieval systems involved. Surprisingly, research in this field is limited and it is still an open question. In this paper, we use the multiple linear regression technique with estimated relevance scores and judged scores to obtain suitable weights. Although the multiple linear regression technique is not new, the way of using it in this paper has never been attempted before for the data fusion problem in information retrieval. Our experiments with five groups of runs submitted to TREC show that the linear combination method with such a weighting strategy steadily outperforms the best component system and other data fusion methods including CombSum, CombMNZ, PosFuse, MAPFuse, SegFuse, and the linear combination method with performance level/performance square weighting schemes by large margins.