A probabilistic solution to the selection and fusion problem in distributed information retrieval
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Modeling score distributions for combining the outputs of search engines
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
The score-distributional threshold optimization for adaptive binary classification tasks
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Maximum likelihood estimation for filtering thresholds
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Machine learning in automated text categorization
ACM Computing Surveys (CSUR)
On score distributions and relevance
ECIR'07 Proceedings of the 29th European conference on IR research
Score Distributions in Information Retrieval
ICTIR '09 Proceedings of the 2nd International Conference on Theory of Information Retrieval: Advances in Information Retrieval Theory
A signal-to-noise approach to score normalization
Proceedings of the 18th ACM conference on Information and knowledge management
Score distribution models: assumptions, intuition, and robustness to score manipulation
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
www.MMRetrieval.net: a multimodal search engine
Proceedings of the Third International Conference on SImilarity Search and APplications
Modeling score distributions in information retrieval
Information Retrieval
Dynamic two-stage image retrieval from large multimodal databases
ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval
Fusion vs. two-stage for multimodal retrieval
ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval
A cascade ranking model for efficient ranked retrieval
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Bag-of-visual-words vs global image descriptors on two-stage multimodal retrieval
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Learning to advertise: how many ads are enough?
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part II
Protocol-driven searches for medical and health-sciences systematic reviews
ICTIR'11 Proceedings of the Third international conference on Advances in information retrieval theory
Predicting Query Performance by Query-Drift Estimation
ACM Transactions on Information Systems (TOIS)
Measuring the ability of score distributions to model relevance
AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology
Extended expectation maximization for inferring score distributions
ECIR'12 Proceedings of the 34th European conference on Advances in Information Retrieval
On the inference of average precision from score distributions
Proceedings of the 21st ACM international conference on Information and knowledge management
Dynamic two-stage image retrieval from large multimedia databases
Information Processing and Management: an International Journal
Modelling Score Distributions Without Actual Scores
Proceedings of the 2013 Conference on the Theory of Information Retrieval
The whens and hows of learning to rank for web search
Information Retrieval
Document Score Distribution Models for Query Performance Inference and Prediction
ACM Transactions on Information Systems (TOIS)
Hi-index | 0.00 |
Ranked retrieval has a particular disadvantage in comparison with traditional Boolean retrieval: there is no clear cut-off point where to stop consulting results. This is a serious problem in some setups. We investigate and further develop methods to select the rank cut-off value which optimizes a given effectiveness measure. Assuming no other input than a system's output for a query--document scores and their distribution--the task is essentially a score-distributional threshold optimization problem. The recent trend in modeling score distributions is to use a normal-exponential mixture: normal for relevant, and exponential for non-relevant document scores. We discuss the two main theoretical problems with the current model, support incompatibility and non-convexity, and develop new models that address them. The main contributions of the paper are two truncated normal-exponential models, varying in the way the out-truncated score ranges are handled. We conduct a range of experiments using the TREC 2007 and 2008 Legal Track data, and show that the truncated models lead to significantly better results.