On score distributions and relevance

Authors: Stephen Robertson
Affiliations: Microsoft Research, Cambridge, UK
Venue: ECIR'07 Proceedings of the 29th European conference on IR research
Year: 2007


Abstract

We discuss the idea of modelling the statistical distributions of scores of documents, classified as relevant or non-relevant. Various specific combinations of standard statistical distributions have been used for this purpose. Some theoretical considerations indicate problems with some of the choices of pairs of distributions. Specifically, we revisit a generalisation of the well-known inverse relationship between recall and precision: some choices of pairs of distributions violate this generalised relationship. We identify the choices and the violations, and explore some of the consequences of this theoretical view.
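
To make the kind of violation described above concrete, here is a minimal numerical sketch. It assumes one pairing that is common in the score-distribution literature: a normal distribution for the scores of relevant documents and an exponential distribution for the scores of non-relevant documents. The abstract does not name the pairs it examines, so this pairing, the parameter values (`mu`, `sigma`, `rate`, `generality`) and the function names are illustrative assumptions and are not taken from the paper. The sketch computes precision as a function of a score threshold. Under the inverse relationship, raising the threshold (lowering recall) should never lower precision.

```python
import math


def normal_sf(t, mu, sigma):
    """Survival function P(S > t) for S ~ Normal(mu, sigma)."""
    return 0.5 * math.erfc((t - mu) / (sigma * math.sqrt(2)))


def exponential_sf(t, rate):
    """Survival function P(S > t) for S ~ Exponential(rate), support t >= 0."""
    return math.exp(-rate * t) if t > 0 else 1.0


def precision_at_threshold(t, generality=0.01, mu=4.0, sigma=1.0, rate=1.0):
    """Precision when every document scoring above t is retrieved.

    generality is the assumed proportion of relevant documents in the
    collection; all default parameter values are hypothetical.
    """
    recall = normal_sf(t, mu, sigma)    # fraction of relevant docs above t
    fallout = exponential_sf(t, rate)   # fraction of non-relevant docs above t
    relevant_mass = generality * recall
    nonrelevant_mass = (1.0 - generality) * fallout
    return relevant_mass / (relevant_mass + nonrelevant_mass)


# Sweep the threshold from 0 to 12 in steps of 0.5.
thresholds = [0.5 * k for k in range(25)]
precisions = [precision_at_threshold(t) for t in thresholds]

# Index of the threshold at which precision is highest.
peak = max(range(len(thresholds)), key=precisions.__getitem__)

for t, p in zip(thresholds, precisions):
    marker = "  <- peak" if t == thresholds[peak] else ""
    print(f"threshold={t:5.1f}  precision={p:.6f}{marker}")
```

With these assumed parameters, precision peaks at an intermediate threshold and then falls as the threshold keeps rising, because the normal tail decays faster than the exponential tail. Above the peak, lowering the threshold raises recall and precision together, which is the opposite of the inverse relationship.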