The linear combination data fusion method in information retrieval

Authors:
Shengli Wu; Yaxin Bi; Xiaoqin Zeng
Affiliations:
School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang, China and School of Computing and Mathematics, University of Ulster, Northern Ireland, UK; School of Computing and Mathematics, University of Ulster, Northern Ireland, UK; College of Computer and Information Engineering, Hehai University, Nanjing, China
Venue:
DEXA'11 Proceedings of the 22nd international conference on Database and expert systems applications - Volume Part II
Year:
2011

Abstract

In information retrieval, data fusion has been investigated by many researchers. Previous investigation and experimentation demonstrate that the linear combination method is an effective data fusion method for combining multiple information retrieval results. One advantage is its flexibility: different weights can be assigned to different component systems so as to obtain better fusion results. However, how to obtain suitable weights for all the component retrieval systems is still an open problem. In this paper, we use the multiple linear regression technique to obtain optimum weights for all involved component systems. Optimum here is meant in the least squares sense: the weights minimize the difference between the scores estimated for all documents by linear combination and the judged scores of those documents. Our experiments with four groups of runs submitted to TREC show that the linear combination method with such weights consistently outperforms, by large margins, the best component system and other major data fusion methods such as CombSum, CombMNZ, and the linear combination method with performance level/performance square weighting schemes.
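
The weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each component system's scores are already normalized to [0, 1], that judged scores are binary relevance labels, and that the regression includes an intercept term; the abstract specifies none of these. The function names (`solve_linear_system`, `fit_weights`, `fuse`) and the toy data are hypothetical, chosen only for illustration. The weights are obtained by solving the normal equations of ordinary least squares.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] as floats.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: component scores are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_weights(scores: Sequence[Sequence[float]], judged: Sequence[float]) -> List[float]:
    """Least squares weights; scores[d][i] is system i's score for document d.

    Returns [intercept, w_1, ..., w_n]. The intercept is an assumption of
    this sketch and does not affect the fused ranking.
    """
    rows = [[1.0] + [float(s) for s in r] for r in scores]
    p = len(rows[0])
    # Normal equations: (X^T X) w = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, judged)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def fuse(weights: Sequence[float], doc_scores: Sequence[float]) -> float:
    """Linear combination: intercept plus the weighted sum of component scores."""
    return weights[0] + sum(w * s for w, s in zip(weights[1:], doc_scores))


# Hypothetical training data: 6 judged documents scored by 3 component systems.
TRAIN_SCORES = [
    [0.9, 0.8, 0.3],
    [0.7, 0.2, 0.6],
    [0.4, 0.9, 0.1],
    [0.2, 0.1, 0.8],
    [0.6, 0.5, 0.5],
    [0.1, 0.3, 0.2],
]
JUDGED = [1, 1, 0, 0, 1, 0]  # assumed binary relevance judgments

if __name__ == "__main__":
    weights = fit_weights(TRAIN_SCORES, JUDGED)
    # Hypothetical unjudged documents to be ranked by fused score.
    new_docs = {"d1": [0.8, 0.4, 0.5], "d2": [0.3, 0.9, 0.2]}
    ranking = sorted(new_docs, key=lambda d: fuse(weights, new_docs[d]), reverse=True)
    print(weights, ranking)
```

For comparison, CombSum corresponds to the special case in which every component weight equals 1, so the regression step is what distinguishes the approach described in the abstract.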