The experiments with the linear combination data fusion method in information retrieval

Authors:
Shengli Wu;Yaxin Bi;Xiaoqin Zeng;Lixin Han
Affiliations:
School of Computing and Mathematics, University of Ulster, Northern Ireland, UK;School of Computing and Mathematics, University of Ulster, Northern Ireland, UK;Department of Computer Science, Hohai University, Nanjing, China;Department of Computer Science, Hohai University, Nanjing, China
Venue:
APWeb'08 Proceedings of the 10th Asia-Pacific web conference on Progress in WWW research and development
Year:
2008

Abstract

In data fusion, the linear combination method is very flexible because different weights can be assigned to different systems. However, which weighting scheme works well remains an open question. In many cases a simple weighting scheme is used: each system's weight is set to its average performance over a group of training queries. In this paper, we empirically investigate the weighting issue. We find that a series of power functions of average performance, which can be implemented as efficiently as the simple weighting scheme, is more effective than the simple scheme for data fusion. We also investigate combined weights that take into account both the performance of component results and the dissimilarity among component results. Further performance improvement in data fusion is achievable by using the combined weights.
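
As a rough illustration of the weighting idea described in the abstract, the sketch below fuses scored result lists by linear combination, with each system's weight set to a power of its average training performance. It is not the authors' implementation: the function names, the `power` parameter, the toy scores, and the assumption that scores are already normalised to a common range are all hypothetical. The combined weights that also account for dissimilarity are not shown, since the abstract does not give their form.

```python
from collections import defaultdict


def power_weights(avg_performance, power=1.0):
    """Weight each system by (average training performance) ** power.

    power=1.0 corresponds to the simple scheme (weight = average
    performance); power=0.0 gives every system equal weight.
    """
    return {system: perf ** power for system, perf in avg_performance.items()}


def linear_combination(results, weights):
    """Fuse per-system document scores into one ranked list.

    results: {system: {doc_id: score}}, scores assumed normalised.
    A document missing from a system's list contributes nothing
    for that system. Ties are broken by document id.
    """
    fused = defaultdict(float)
    for system, doc_scores in results.items():
        for doc, score in doc_scores.items():
            fused[doc] += weights[system] * score
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Toy data (hypothetical): two systems, three documents.
avg_performance = {"A": 0.5, "B": 0.25}
results = {
    "A": {"d1": 1.0, "d3": 0.25},
    "B": {"d2": 1.0, "d3": 1.0},
}

simple = linear_combination(results, power_weights(avg_performance, 1.0))
equal = linear_combination(results, power_weights(avg_performance, 0.0))
squared = linear_combination(results, power_weights(avg_performance, 2.0))
```

With this toy data, equal weights rank `d3` first, while the performance-based weights favour `d1`, the top document of the stronger system. Raising the power widens the gap between strong and weak systems, and which power works well would have to be determined empirically on training queries.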