We review the history of modeling score distributions, focusing on the normal-exponential mixture and examining both the theoretical and the empirical evidence supporting its use. We discuss previously suggested conditions that valid binary mixture models should satisfy, such as the Recall-Fallout Convexity Hypothesis, and formulate two new hypotheses concerning the component distributions, individually as well as in pairs, under some limiting conditions on parameter values. Of all the mixtures suggested in the past, the current theoretical argument points to the two-gamma mixture as the most likely universal model, with the normal-exponential being a usable approximation. Beyond the theoretical contribution, we provide new experimental evidence showing that vector space or geometric models, and BM25, are 'friendly' to the normal-exponential, and that the non-convexity problem of this mixture is not severe in practice. Furthermore, we review recent non-binary mixture models, speculate on graded relevance, and consider methods such as logistic regression for score calibration.
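To make the non-convexity problem concrete, here is a minimal Python sketch of a normal-exponential mixture, assuming relevant scores follow a normal density and non-relevant scores an exponential density. The function names, the mixing weight `g`, and the parameters `mu`, `sigma`, and `lam` are illustrative assumptions, not notation taken from the paper. Under these assumptions the probability of relevance rises with the score only up to `mu + lam * sigma**2` and then falls, because the normal tail decays faster than the exponential one. That decline is the kind of behaviour a convexity condition on the recall-fallout curve is meant to rule out.

```python
import math


def normal_pdf(s, mu, sigma):
    """Density of a normal distribution (assumed model for relevant scores)."""
    z = (s - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def exponential_pdf(s, lam):
    """Density of an exponential distribution (assumed model for non-relevant scores)."""
    return lam * math.exp(-lam * s) if s >= 0 else 0.0


def posterior_relevance(s, g, mu, sigma, lam):
    """P(relevant | score = s) under the mixture, with g the assumed fraction of relevant documents."""
    rel = g * normal_pdf(s, mu, sigma)
    non = (1.0 - g) * exponential_pdf(s, lam)
    total = rel + non
    return rel / total if total > 0 else 0.0


def posterior_peak(mu, sigma, lam):
    """Score at which the posterior is maximal.

    The log-ratio of the two densities is -(s - mu)^2 / (2 sigma^2) + lam * s + const,
    whose derivative vanishes at s = mu + lam * sigma^2.
    """
    return mu + lam * sigma ** 2


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    g, mu, sigma, lam = 0.1, 3.0, 1.0, 1.0
    print("posterior peaks at score", posterior_peak(mu, sigma, lam))
    for s in (1.0, 3.0, 4.0, 6.0, 8.0):
        print(s, posterior_relevance(s, g, mu, sigma, lam))
```

How far the peak lies from the scores that actually occur depends on the fitted parameters, which fits the abstract's remark that the problem is not severe in practice.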