The thresholding of document scores has proved critical for the effectiveness of classification tasks. We review the most important approaches to thresholding, and introduce the score-distributional (S-D) threshold optimization method. The method is based on score distributions and is capable of optimizing any effectiveness measure defined in terms of the traditional contingency table.

As a byproduct, we provide a model for score distributions, and demonstrate its high accuracy in describing empirical data. The estimation method can be performed incrementally, a highly desirable feature for adaptive environments. Our work in modeling score distributions is useful beyond threshold optimization problems. It directly applies to other retrieval environments that make use of score distributions, e.g., distributed retrieval, or topic detection and tracking.

The most accurate version of S-D thresholding, although incremental, can be computationally heavy. Therefore, we also investigate more practical solutions. We suggest practical approximations and discuss adaptivity, threshold initialization, and incrementality issues. The practical version of S-D thresholding has been tested in the context of the TREC-9 Filtering Track and found to be very effective [2].
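To make the idea concrete, the following is a minimal sketch of threshold optimization from score distributions, not the authors' implementation. The abstract does not name the distribution families, so the sketch assumes a Gaussian for relevant-document scores and an exponential for non-relevant scores, a pairing often used in the score-distribution literature. All names (`RunningGaussian`, `expected_table`, `optimize_threshold`, `linear_utility`) and all parameter values are hypothetical. The running-statistics class illustrates one way the estimation could be kept incremental.

```python
import math


class RunningGaussian:
    """Mean and standard deviation updated one score at a time (Welford)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, score):
        self.n += 1
        delta = score - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (score - self.mean)

    @property
    def std(self):
        # Population standard deviation; 0.0 until two scores are seen.
        return math.sqrt(self._m2 / self.n) if self.n > 1 else 0.0


def gaussian_tail(theta, mean, std):
    """P(score > theta) under Normal(mean, std); assumes std > 0."""
    return 0.5 * math.erfc((theta - mean) / (std * math.sqrt(2.0)))


def exponential_tail(theta, rate):
    """P(score > theta) under an exponential starting at score 0."""
    return math.exp(-rate * theta) if theta > 0.0 else 1.0


def expected_table(theta, n_rel, n_non, rel_mean, rel_std, non_rate):
    """Expected contingency table (tp, fp, fn, tn) at threshold theta."""
    tp = n_rel * gaussian_tail(theta, rel_mean, rel_std)
    fp = n_non * exponential_tail(theta, non_rate)
    return tp, fp, n_rel - tp, n_non - fp


def optimize_threshold(candidates, measure, n_rel, n_non,
                       rel_mean, rel_std, non_rate):
    """Return the candidate threshold with the highest expected measure."""
    return max(
        candidates,
        key=lambda t: measure(
            *expected_table(t, n_rel, n_non, rel_mean, rel_std, non_rate)
        ),
    )


def linear_utility(tp, fp, fn, tn):
    """Example measure (an assumption): reward 2 per hit, cost 1 per miss-fire."""
    return 2.0 * tp - fp


if __name__ == "__main__":
    # Hypothetical relevant-document scores observed so far.
    rel = RunningGaussian()
    for s in (0.62, 0.71, 0.80, 0.67, 0.74):
        rel.update(s)

    grid = [i / 100.0 for i in range(101)]
    best = optimize_threshold(
        grid, linear_utility,
        n_rel=50, n_non=950,
        rel_mean=rel.mean, rel_std=rel.std, non_rate=10.0,
    )
    print("chosen threshold:", best)
```

Because the measure is an ordinary function of the four expected counts, swapping `linear_utility` for, say, an F-measure computed from `tp`, `fp`, and `fn` requires no change to the optimizer. A grid search is used here only for simplicity; it is one possible stand-in for the cheaper approximations the abstract alludes to.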