The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen's F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.
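The sequential testing bias described above can be illustrated with a small Monte Carlo sketch. This is not the framework or the power analysis method of the paper; it is a toy simulation under stated assumptions. All names and parameter values (`run_trial`, `simulate`, the recall schedule, the prevalence of 0.3, the false positive rate of 0.05, a test sample of 200 documents, a target of 0.8) are hypothetical. It also assumes the balanced F-measure (F1) and a fresh test sample at each step. A sequence of classifiers with growing recall is tested in turn, and testing stops the first time the estimated F1 meets the target.

```python
import random


def f1(tp, fp, fn):
    """Balanced F-measure (F1) from counts or rates; 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def true_f1(prevalence, recall, fpr):
    """Population F1 of a classifier with the given recall and false positive rate."""
    tp = prevalence * recall
    fp = (1 - prevalence) * fpr
    fn = prevalence * (1 - recall)
    return f1(tp, fp, fn)


def estimate_f1(rng, n, prevalence, recall, fpr):
    """Estimate F1 from a simulated random test sample of n documents."""
    tp = fp = fn = 0
    for _ in range(n):
        if rng.random() < prevalence:      # document is truly positive
            if rng.random() < recall:
                tp += 1
            else:
                fn += 1
        elif rng.random() < fpr:           # negative document flagged positive
            fp += 1
    return f1(tp, fp, fn)


def run_trial(rng, recalls, target, n=200, prevalence=0.3, fpr=0.05):
    """Test classifiers in order; stop at the first estimate meeting the target.

    Returns (estimated F1, true F1) at the stopping point, or None if the
    target is never met.
    """
    for recall in recalls:
        est = estimate_f1(rng, n, prevalence, recall, fpr)
        if est >= target:
            return est, true_f1(prevalence, recall, fpr)
    return None


def simulate(trials=2000, target=0.8, seed=0):
    """Average estimated and true F1 over all trials that stopped."""
    rng = random.Random(seed)
    # Hypothetical learning curve: recall grows as the training set grows.
    recalls = [0.50 + 0.04 * k for k in range(10)]
    results = (run_trial(rng, recalls, target) for _ in range(trials))
    stopped = [r for r in results if r is not None]
    mean_est = sum(e for e, _ in stopped) / len(stopped)
    mean_true = sum(t for _, t in stopped) / len(stopped)
    return mean_est, mean_true


if __name__ == "__main__":
    mean_est, mean_true = simulate()
    print(f"mean estimated F1 at stopping: {mean_est:.3f}")
    print(f"mean true F1 at stopping:      {mean_true:.3f}")
```

Because stopping is conditioned on the estimate reaching the target, the average estimate at the stopping point tends to exceed the average true F1 of the classifier that was accepted. The size of the gap depends on the assumed test sample size and learning curve.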