Document Score Distribution Models for Query Performance Inference and Prediction
ACM Transactions on Information Systems (TOIS)
In this paper, we consider the task of estimating query effectiveness, i.e., assessing retrieval system performance in the absence of user relevance judgments. In our approach, we model the score associated with each document in the result set as a Gaussian random variable. The mean and variance of each document score can then be used to estimate the probability that one document will be ranked above another, and thus to calculate the expected rank of each document in the ranked list. We propose to measure system effectiveness by comparing the predicted and actual ranks of the retrieved documents. In our experiments we consider two retrieval models and five document scoring methods and evaluate their impact on the proposed estimation measures. Experiments on standardized data sets that include document relevance judgments, on the task of predicting relative query effectiveness, show that the expected-rank metric is robust to variations in document scoring and retrieval algorithms.
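The expected-rank computation described in the abstract can be sketched as follows. For two independent Gaussian scores, the probability that one document outranks another is a normal CDF of their standardized score difference, and the expected rank of a document is one plus the sum of the probabilities that each other document scores higher. This is a minimal illustration under an independence assumption; the function names are ours, and the paper's actual estimators for each score's mean and variance are not reproduced here.

```python
import math

def prob_outranked(mu_i, var_i, mu_j, var_j):
    """P(score_j > score_i) for independent Gaussian scores.

    The difference score_j - score_i is Gaussian with mean mu_j - mu_i
    and variance var_i + var_j, so the probability is a standard normal CDF.
    """
    z = (mu_j - mu_i) / math.sqrt(var_i + var_j)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_ranks(means, variances):
    """Expected rank of each document (1 = top of the list):
    1 plus the sum over all other documents of the probability
    that they are ranked above this one."""
    n = len(means)
    return [
        1.0 + sum(
            prob_outranked(means[i], variances[i], means[j], variances[j])
            for j in range(n) if j != i
        )
        for i in range(n)
    ]
```

Because the pairwise probabilities for each document pair sum to one, the expected ranks always sum to n(n+1)/2; comparing them against the actual ranks in the retrieved list yields the effectiveness estimate described above.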