Estimation of Classifier Performance
IEEE Transactions on Pattern Analysis and Machine Intelligence
We categorize the statistical assessment of classifiers into three levels: (1) assessing the classification performance and its testing variability conditional on a fixed training set; (2) assessing the performance and its variability accounting for both training and testing; and (3) assessing the performance averaged over training sets and its variability accounting for both training and testing. We derive analytical expressions for the variance of the estimated AUC and provide freely available software implementing an efficient computation algorithm. Our approach can be applied to assess any classifier with ordinal (continuous or discrete) outputs. Applications to simulated and real datasets are presented to illustrate our methods.
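As background for the quantities discussed above, the empirical AUC is the Mann-Whitney two-sample statistic, and a standard nonparametric variance estimate can be formed from its per-case structural components (the DeLong-style decomposition). The sketch below illustrates this generic construction only; it is not the analytical variance expressions or software of this paper, and the function name is our own.

```python
import numpy as np

def auc_and_variance(neg_scores, pos_scores):
    """Empirical (Mann-Whitney) AUC and a DeLong-style variance
    estimate from the two-sample structural components.

    neg_scores: classifier outputs for actually-negative cases
    pos_scores: classifier outputs for actually-positive cases
    """
    neg = np.asarray(neg_scores, dtype=float)
    pos = np.asarray(pos_scores, dtype=float)
    m, n = len(neg), len(pos)
    # Pairwise comparison kernel: 1 if the positive case scores
    # higher than the negative case, 1/2 on ties, 0 otherwise.
    psi = (pos[:, None] > neg[None, :]).astype(float)
    psi += 0.5 * (pos[:, None] == neg[None, :])
    auc = psi.mean()
    # Structural components: average kernel value per case.
    v_pos = psi.mean(axis=1)  # one component per positive case
    v_neg = psi.mean(axis=0)  # one component per negative case
    # Variance of the estimated AUC (testing variability only,
    # conditional on a fixed trained classifier).
    var = v_pos.var(ddof=1) / n + v_neg.var(ddof=1) / m
    return auc, var
```

This captures only the first assessment level (testing variability conditional on a fixed training set); the variability contributed by a random training set requires the fuller treatment developed in the paper.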