One typically expects a classifier's performance to improve with increasing training set size, or at least to be best when one has infinitely many training samples at one's disposal. We demonstrate, however, that there are classification problems on which particular classifiers attain their optimum performance at a finite training set size. Whether or not this phenomenon, which we term dipping, can be observed depends on the choice of classifier in relation to the underlying class distributions. We give some simple examples, for a few classifiers, that illustrate how dipping can occur, and we speculate about what is generally needed for it to emerge. What is clear is that this kind of learning curve behavior does not arise by mere chance and that the pattern recognition practitioner ought to take note of it.
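A minimal simulation can make the phenomenon concrete. The construction below is a hypothetical illustration (not necessarily one of the paper's own examples), assuming a one-dimensional problem and a nearest-mean classifier: class A is a point mass at 0, while class B mixes mass 0.8 at +2 with mass 0.2 at -10. Small training sets rarely include the distant component, so the estimated class B mean sits near +2 and the learned threshold separates the bulk of both classes well; with many samples, the outlying component pulls the class B mean past the class A mean and the asymptotic classifier performs worse. The true error of each trained rule can be computed exactly from the three atoms.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_b(n):
    """Draw n training points from class B: 0.8 mass at +2, 0.2 mass at -10."""
    return np.where(rng.random(n) < 0.8, 2.0, -10.0)

def nearest_mean_error(n, trials=2000):
    """Average TRUE error of the nearest-mean rule trained on n points per class."""
    errs = []
    for _ in range(trials):
        mean_a = 0.0              # class A is a point mass at 0, so its mean is known
        mean_b = sample_b(n).mean()
        # Nearest-mean rule: assign x to the class whose mean is closer.
        # Class A's only atom (x = 0) coincides with mean_a, so it never errs;
        # class B errs on whichever of its atoms falls closer to mean_a.
        e_b = 0.8 * (1.0 if abs(2.0 - mean_a) < abs(2.0 - mean_b) else 0.0) \
            + 0.2 * (1.0 if abs(-10.0 - mean_a) < abs(-10.0 - mean_b) else 0.0)
        errs.append(0.5 * e_b)    # equal class priors
    return float(np.mean(errs))

err_small = nearest_mean_error(2)    # tiny training set
err_large = nearest_mean_error(500)  # near-asymptotic training set
print(err_small, err_large)          # the small-sample classifier does better
```

Under these assumed distributions the expected error is lowest at the smallest training set sizes and rises toward its asymptotic value as the outlying component is sampled more reliably, so the learning curve's minimum lies at a finite sample size, which is exactly the dipping behavior the abstract describes.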