Towards minimizing the annotation cost of certified text classification

Authors:
Mossaab Bagdouri; William Webber; David D. Lewis; Douglas W. Oard
Affiliations:
University of Maryland, College Park, MD, USA; University of Maryland, College Park, MD, USA; David D. Lewis Consulting, Chicago, IL, USA; University of Maryland, College Park, MD, USA
Venue:
Proceedings of the 22nd ACM International Conference on Information & Knowledge Management
Year:
2013

Abstract

The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen's F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.
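
The sequential testing bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method or experimental setup; it is a minimal illustration under stated assumptions. A hypothetical sequence of classifiers with rising recall (the values in `recalls`), a fixed false positive rate, and a fixed prevalence is tested one classifier at a time, each on a fresh random test sample, and testing stops the first time the sample F1 reaches a target. All names (`true_f1`, `sample_f1`, `run_sequential`, `mean_bias`) and all parameter values are invented for illustration. F1 is computed from counts as 2TP / (2TP + FP + FN).

```python
import random


def true_f1(recall, fpr, prevalence):
    """Population F1 implied by recall, false positive rate, and prevalence."""
    tp = prevalence * recall
    fn = prevalence * (1 - recall)
    fp = (1 - prevalence) * fpr
    return 2 * tp / (2 * tp + fp + fn)


def sample_f1(recall, fpr, prevalence, n, rng):
    """F1 estimated from a simulated random test sample of n documents."""
    tp = fp = fn = 0
    for _ in range(n):
        if rng.random() < prevalence:      # document is truly positive
            if rng.random() < recall:
                tp += 1
            else:
                fn += 1
        elif rng.random() < fpr:           # truly negative, wrongly flagged
            fp += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def run_sequential(recalls, fpr, prevalence, n, target, rng):
    """Test classifiers in order; stop at the first estimate >= target.

    Returns (estimated F1, true F1) at the stopping point, or None if the
    target is never reached. Assumption: each stage uses a fresh sample.
    """
    for recall in recalls:
        estimate = sample_f1(recall, fpr, prevalence, n, rng)
        if estimate >= target:
            return estimate, true_f1(recall, fpr, prevalence)
    return None


def mean_bias(trials=2000, seed=0):
    """Average of (estimated F1 - true F1) at the stopping point."""
    rng = random.Random(seed)
    recalls = [0.50, 0.55, 0.60, 0.65, 0.70, 0.75]  # hypothetical learning curve
    gaps = []
    for _ in range(trials):
        result = run_sequential(recalls, fpr=0.05, prevalence=0.2,
                                n=200, target=0.70, rng=rng)
        if result is not None:
            estimate, truth = result
            gaps.append(estimate - truth)
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    print(f"mean (estimate - truth) at stopping: {mean_bias():+.4f}")
```

Because stopping is triggered only by estimates that reach the target, the estimate reported at the stopping point tends to exceed the true F1 of the classifier being certified, so the average gap is positive. A fixed, pre-planned test on a separate sample would not have this selection effect.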