Text classifiers that output probability estimates are applicable in a wider range of scenarios than classifiers that output only scores or hard decisions. For example, rather than committing to a single fixed decision threshold, they can be used in a Bayesian risk model to issue a run-time decision that minimizes a user-specified cost function chosen dynamically at prediction time. However, the quality of the probability estimates is crucial. We review a variety of standard approaches to converting scores (and poor probability estimates) from text classifiers into high-quality estimates, and we introduce new models motivated by the intuition that the empirical score distributions for the "extremely irrelevant", "hard to discriminate", and "obviously relevant" items are often significantly different. Finally, we analyze the experimental performance of these models over the outputs of two text classifiers. The analysis demonstrates that one of these models is theoretically attractive (introducing few new parameters while increasing flexibility), computationally efficient, and empirically preferable.
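To make the Bayesian risk model mentioned above concrete, the following is a minimal sketch in Python of a minimum-expected-cost decision. It is not taken from the paper: the function name `min_risk_decision`, the action and label names, and the cost values are all hypothetical, and the input probability is assumed to be already calibrated.

```python
from typing import Dict

# Hypothetical cost table layout (an assumption for illustration):
# costs[action][true_label] is the cost of taking `action` when the
# document's true label is `true_label`.
Costs = Dict[str, Dict[str, float]]


def min_risk_decision(p_relevant: float, costs: Costs) -> str:
    """Return the action with the lowest expected cost.

    p_relevant is assumed to be a calibrated estimate of
    P(relevant | document); poorly calibrated estimates would
    lead to poor decisions here.
    """
    if not 0.0 <= p_relevant <= 1.0:
        raise ValueError("p_relevant must lie in [0, 1]")
    expected = {
        action: p_relevant * c["relevant"] + (1.0 - p_relevant) * c["irrelevant"]
        for action, c in costs.items()
    }
    # Pick the action whose expected cost is smallest.
    return min(expected, key=expected.get)


# Illustrative costs: missing a relevant document is five times
# as costly as accepting an irrelevant one.
asymmetric_costs = {
    "accept": {"relevant": 0.0, "irrelevant": 1.0},
    "reject": {"relevant": 5.0, "irrelevant": 0.0},
}

decision = min_risk_decision(0.2, asymmetric_costs)
```

With these illustrative costs, a document with estimated probability 0.2 is accepted (expected cost 0.8 versus 1.0 for rejecting), even though a fixed threshold of 0.5 would reject it. Because the cost table is an argument, it can be changed at prediction time without retraining the classifier, which is only sound if the probability estimates are accurate.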