Diagnostic Evaluation of Information Retrieval Models

  • Authors: Hui Fang; Tao Tao; Chengxiang Zhai
  • Affiliations: University of Delaware; Microsoft Corporation; University of Illinois at Urbana-Champaign
  • Venue: ACM Transactions on Information Systems (TOIS)
  • Year: 2011

Abstract

Developing effective retrieval models is a long-standing central challenge in information retrieval research. To develop more effective models, it is necessary to understand the deficiencies of current retrieval models and the relative strengths of each. In this article, we propose a general methodology to analytically and experimentally diagnose the weaknesses of a retrieval function, which provides guidance on how to further improve its performance. Our methodology is motivated by the empirical observation that good retrieval performance is closely related to the use of various retrieval heuristics. We connect the weaknesses and strengths of a retrieval function with its implementation of these retrieval heuristics, and propose two strategies to check how well a retrieval function implements the desired heuristics. The first strategy is to formalize heuristics as constraints and use constraint analysis to analytically check their implementation. The second strategy is to define a set of relevance-preserving perturbations and perform diagnostic tests to empirically evaluate how well a retrieval function implements the heuristics. Experiments show that both strategies are effective in identifying potential problems in the implementation of retrieval heuristics, and the performance of retrieval functions can be improved once these problems are fixed.
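To make the first strategy concrete, the sketch below checks one heuristic, phrased as a constraint on a scoring function: adding an occurrence of a query term to a document should not decrease its score. The BM25-style scorer, the constraint name, the parameter values, and the helper functions are illustrative assumptions for this sketch, not the exact constraints or retrieval functions analyzed in the article.

```python
# Minimal sketch of constraint-based diagnosis of a retrieval function.
# Assumptions (not from the article): a BM25-style per-term scorer and a
# term-frequency constraint stating the score must be non-decreasing in tf.

import math


def bm25_term_score(tf, doc_len, avg_doc_len, df, n_docs, k1=1.2, b=0.75):
    """Score contribution of a single query term under a BM25-style formula."""
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    length_norm = k1 * ((1 - b) + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + length_norm)


def check_tf_constraint(score_fn, doc_len=100, avg_doc_len=100,
                        df=50, n_docs=10_000, max_tf=20):
    """Check empirically that the score never drops when tf grows by one."""
    scores = [score_fn(tf, doc_len, avg_doc_len, df, n_docs)
              for tf in range(1, max_tf + 1)]
    # Record (tf, score_at_tf, score_at_tf+1) wherever the score decreases.
    return [(tf, prev, nxt)
            for tf, (prev, nxt) in enumerate(zip(scores, scores[1:]), start=1)
            if nxt < prev]


if __name__ == "__main__":
    violations = check_tf_constraint(bm25_term_score)
    if violations:
        print("Term-frequency constraint violated at:", violations)
    else:
        print("Term-frequency constraint satisfied on the tested range.")
```

The second strategy could be scripted in the same spirit: apply a relevance-preserving perturbation to test documents (for example, appending non-query terms) and test whether the resulting score change goes in the direction the heuristic prescribes; the specific perturbations used in the article are not detailed here.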