"Data fusion" refers to the problem in information retrieval (IR) where several lists of documents ranked against a query are to be merged into a single ranked list for presentation to a user. Data fusion is also known as "metasearch." In a digital library setting data fusion may support operations such as federated search based on multiple repository representations. This paper presents a novel approach to the fusion problem: generative model-based Metasearch (GeM). We suggest viewing the appearance of documents in a return set as the outcome of a probabilistic process; some documents are likely to occur in the model, while others are unlikely. Using Bayesian parameter estimation to fit a multinomial distribution based on the return sets to be merged, GeM achieves a final ranking by listing documents in decreasing probability of generation under the induced model. We also introduce what we call "the impatient reader" approach to normalizing document ranks in service to the fusion operation. We report results from several experiments on TREC data suggesting that GeM, informed with impatient reader document scores, operates at state-of-the-art levels of effectiveness.