Probabilistic data fusion on a large document collection

Authors:
David Lillis;Fergus Toolan;Rem Collier;John Dunnion
Affiliations:
School of Computer Science and Informatics, University College Dublin, Dublin 4, Ireland;Faculty of Computing Science, Griffith College Dublin, Dublin 8, Ireland;School of Computer Science and Informatics, University College Dublin, Dublin 4, Ireland;School of Computer Science and Informatics, University College Dublin, Dublin 4, Ireland
Venue:
Artificial Intelligence Review
Year:
2006

Abstract

Data fusion is the process of combining the output of a number of Information Retrieval (IR) algorithms into a single result set in order to achieve greater retrieval performance. ProbFuse is a data fusion algorithm that uses the history of the underlying IR algorithms to estimate the probability that subsequent result sets include relevant documents in particular positions. In a number of previous experiments it has been shown to outperform CombMNZ, the standard data fusion algorithm against which performance is compared. This paper builds upon that work and applies probFuse to the much larger Web Track document collection from the 2004 Text REtrieval Conference. The performance of probFuse is compared against that of CombMNZ using a number of evaluation measures, and probFuse is shown to achieve substantial performance improvements.
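
For readers unfamiliar with the CombMNZ baseline named in the abstract, the following is a minimal sketch of that technique as it is commonly described: each document's normalized scores are summed across the input systems, and the sum is multiplied by the number of systems that returned the document. The function names, the use of min-max normalization, and the sample scores are assumptions made for illustration; this is not the code or the configuration used in the paper's experiments, and it does not implement probFuse.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Scale one system's scores into [0, 1] (assumed normalization scheme)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every returned document as equally strong.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_mnz(result_sets):
    """Fuse result sets given as {doc_id: score} dicts, one per input system.

    Fused score = (sum of normalized scores) * (number of systems
    that returned the document). Returns (doc_id, score) pairs, best first.
    """
    total = defaultdict(float)
    hits = defaultdict(int)
    for scores in result_sets:
        if not scores:
            continue  # a system that returned nothing contributes nothing
        for doc, s in min_max_normalize(scores).items():
            total[doc] += s
            hits[doc] += 1
    fused = {doc: total[doc] * hits[doc] for doc in total}
    # Sort by descending score, breaking ties by document id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical outputs of two input systems for a single query.
system_a = {"d1": 3.0, "d2": 2.0, "d3": 1.0}
system_b = {"d2": 0.9, "d4": 0.1}
ranking = comb_mnz([system_a, system_b])
```

In this toy input, the document returned by both systems (`d2`) rises to the top of the fused ranking, which is the behaviour the multiplier is intended to reward. ProbFuse differs in that it relies on the past performance of each input system rather than on the scores of the current result sets alone.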