Improving high accuracy retrieval by eliminating the uneven correlation effect in data fusion

Authors:
Shengli Wu; Sally McClean
Affiliations:
School of Computing and Mathematics, University of Ulster, Newtownabbey, United Kingdom, BT37 0QB; School of Computing and Mathematics, University of Ulster, Newtownabbey, United Kingdom, BT37 0QB
Venue:
Journal of the American Society for Information Science and Technology
Year:
2006

Abstract

The aim of this research is twofold. First, high accuracy retrieval has been a concern of the information retrieval community for some time, and we investigate it via data fusion. Second, correlation among component results has been shown to harm data fusion, yet data fusion algorithms have not taken it into account. In the hope of achieving better performance, we propose a group of algorithms that eliminate the effect of uneven correlation among component results by assigning different weights to all component results or their combinations; the linear combination method, or a variation of it, is then used for fusion. Extensive experiments evaluate these algorithms on six groups of component results: the top 10 systems submitted to the Text REtrieval Conference (TREC) 6, 7, 8, 9, 2001, and 2002. The experimental results show that all eight data fusion methods involved outperform the best component system on average, which demonstrates that data fusion in general is effective when the component retrieval results are accurate. The results also demonstrate that all six methods presented in this article are effective at eliminating the effect of uneven correlation among component results: all of them outperform CombSum, and five of them outperform CombMNZ, on average. © 2006 Wiley Periodicals, Inc.
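
To make the methods named in the abstract concrete, the following is a minimal Python sketch of the two baselines (CombSum and CombMNZ) and of a generic weighted linear combination. It assumes each component result is a dictionary mapping a document ID to an already normalized score. The abstract does not describe the six proposed weighting algorithms, so `overlap_weights` is a purely hypothetical stand-in that down-weights systems whose retrieved sets overlap heavily with the others; it is not the authors' method.

```python
from collections import defaultdict


def comb_sum(runs):
    """CombSum: a document's fused score is the sum of its scores over all runs."""
    fused = defaultdict(float)
    for run in runs:
        for doc, score in run.items():
            fused[doc] += score
    return dict(fused)


def comb_mnz(runs):
    """CombMNZ: the CombSum score times the number of runs that retrieved the document."""
    sums = comb_sum(runs)
    hits = defaultdict(int)
    for run in runs:
        for doc in run:
            hits[doc] += 1
    return {doc: sums[doc] * hits[doc] for doc in sums}


def linear_combination(runs, weights):
    """Linear combination: each run's scores are scaled by that run's weight, then summed."""
    fused = defaultdict(float)
    for run, weight in zip(runs, weights):
        for doc, score in run.items():
            fused[doc] += weight * score
    return dict(fused)


def overlap_weights(runs):
    """Hypothetical weighting (an assumption, not taken from the article).

    A run that shares many documents with the other runs gets a smaller
    weight, so a cluster of near-duplicate systems does not dominate.
    """
    weights = []
    for i, run in enumerate(runs):
        others = [r for j, r in enumerate(runs) if j != i]
        if not others:
            weights.append(1.0)
            continue
        overlaps = []
        for other in others:
            union = run.keys() | other.keys()
            shared = run.keys() & other.keys()
            overlaps.append(len(shared) / len(union) if union else 0.0)
        mean_overlap = sum(overlaps) / len(overlaps)
        weights.append(1.0 / (1.0 + mean_overlap))
    return weights


def ranked(fused):
    """Return document IDs ordered by descending fused score."""
    return sorted(fused, key=fused.get, reverse=True)


if __name__ == "__main__":
    # Toy runs: a and c retrieve the same documents, b differs.
    a = {"d1": 1.0, "d2": 0.5}
    b = {"d1": 0.8, "d3": 0.4}
    c = {"d1": 0.9, "d2": 0.6}
    runs = [a, b, c]
    print(ranked(comb_sum(runs)))
    print(ranked(comb_mnz(runs)))
    print(ranked(linear_combination(runs, overlap_weights(runs))))
```

With all weights equal to 1, `linear_combination` reduces to CombSum, which is why unequal weights are the natural place to compensate for uneven correlation among component results. In the toy data, runs `a` and `c` retrieve the same documents, so the hypothetical weighting gives each of them a smaller weight than run `b`.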