Document Score Distribution Models for Query Performance Inference and Prediction
ACM Transactions on Information Systems (TOIS)
In this paper, we consider the task of estimating query effectiveness, i.e., assessing retrieval system performance in the absence of user relevance judgments. In our approach, we model the score associated with each document in the result set as a Gaussian random variable. The mean and variance of each document score can then be used to estimate the probability that one document will be ranked above another, and thus to calculate the expected rank of each document in the ranked list. We propose to measure system effectiveness by comparing the predicted and actual ranks of the retrieved documents. In our experiments we consider two retrieval models and five document scoring methods and evaluate their impact on the proposed estimation measures. Experiments on standardized data sets that include document relevance judgments, on the task of predicting relative query effectiveness, show that the expected-rank metric is robust to variations in document scoring and retrieval algorithms.
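The expected-rank computation described in the abstract can be sketched as follows. For two independent Gaussian scores, the probability that one document outranks another is a normal CDF of their standardized score difference, and the expected rank of a document is one plus the sum of the probabilities that each other document scores higher. This is a minimal illustration under an independence assumption; the function names are ours, and the paper's actual estimators for each score's mean and variance are not reproduced here.

```python
import math

def prob_outranked(mu_i, var_i, mu_j, var_j):
    """P(score_j > score_i) for independent Gaussian scores.

    The difference score_j - score_i is Gaussian with mean mu_j - mu_i
    and variance var_i + var_j, so the probability is a standard normal CDF.
    """
    z = (mu_j - mu_i) / math.sqrt(var_i + var_j)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_ranks(means, variances):
    """Expected rank of each document (1 = top of the list):
    1 plus the sum over all other documents of the probability
    that they are ranked above this one."""
    n = len(means)
    return [
        1.0 + sum(
            prob_outranked(means[i], variances[i], means[j], variances[j])
            for j in range(n) if j != i
        )
        for i in range(n)
    ]
```

Because the pairwise probabilities for each document pair sum to one, the expected ranks always sum to n(n+1)/2; comparing them against the actual ranks in the retrieved list yields the effectiveness estimate described above.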