What Size Test Set Gives Good Error Rate Estimates?

Authors:
Isabelle Guyon;John Makhoul;Richard Schwartz;Vladimir Vapnik
Affiliations:
-;BBN Systems and Technologies, Cambridge, MA;BBN Systems and Technologies, Cambridge, MA;AT&T Labs, Red Bank, NJ
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
1998

Abstract

We address the problem of determining what size test set guarantees statistically significant results in a character recognition task, as a function of the expected error rate. We provide a statistical analysis showing that if, for example, the expected character error rate is around 1 percent, then, with a test set of at least 10,000 statistically independent handwritten characters (which could be obtained by taking 100 characters from each of 100 different writers), we guarantee, with 95 percent confidence, that: (1) the expected value of the character error rate is no worse than 1.25 E, where E is the empirical character error rate of the best recognizer, calculated on the test set; and (2) a difference of 0.3 E between the error rates of two recognizers is significant. We developed this framework with character recognition applications in mind, but it applies equally to speech recognition and to other pattern recognition problems.