Cross-validation for binary classification by real-valued functions: theoretical analysis
COLT '98 Proceedings of the eleventh annual conference on Computational learning theory
Beating the hold-out: bounds for K-fold and progressive cross-validation
COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Inference for the Generalization Error
Machine Learning
A study of cross-validation and bootstrap for accuracy estimation and model selection
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Accuracy estimation with clustered dataset
AusDM '06 Proceedings of the fifth Australasian conference on Data mining and analytics - Volume 61
Classifying carpets based on laser scanner data
Engineering Applications of Artificial Intelligence
Estimating the Confidence Interval for Prediction Errors of Support Vector Machine Classifiers
The Journal of Machine Learning Research
Predicting failures with developer networks and social network analysis
Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering
Semi-analytical method for analyzing models and model selection measures based on moment analysis
ACM Transactions on Knowledge Discovery from Data (TKDD)
A Unified Framework for MR Based Disease Classification
IPMI '09 Proceedings of the 21st International Conference on Information Processing in Medical Imaging
Short-term forecasting of emergency inpatient flow
IEEE Transactions on Information Technology in Biomedicine
A novel measure for evaluating classifiers
Expert Systems with Applications: An International Journal
A Survey of Accuracy Evaluation Metrics of Recommendation Tasks
The Journal of Machine Learning Research
Reliable prediction system based on support vector regression with genetic algorithms
ICNC'09 Proceedings of the 5th international conference on Natural computation
Artificial Intelligence in Medicine
Computational Statistics & Data Analysis
On Over-fitting in Model Selection and Subsequent Selection Bias in Performance Evaluation
The Journal of Machine Learning Research
Unsupervised Layer-Wise Model Selection in Deep Neural Networks
Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
Parameter screening and optimisation for ILP using designed experiments
ILP'09 Proceedings of the 19th international conference on Inductive logic programming
Computer Science - Research and Development
Parameter Screening and Optimisation for ILP using Designed Experiments
The Journal of Machine Learning Research
Building a qualitative recruitment system via SVM with MCDM approach
Applied Intelligence
Correcting bias in statistical tests for network classifier evaluation
ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III
COLT'05 Proceedings of the 18th annual conference on Learning Theory
Derivation of an artificial gene to improve classification accuracy upon gene selection
Computational Biology and Chemistry
Uncertainty estimation with a finite dataset in the assessment of classification models
Computational Statistics & Data Analysis
Classification of nuclear receptor subfamilies with RBF kernel in support vector machine
ISNN'05 Proceedings of the Second international conference on Advances in Neural Networks - Volume Part III
Classifier variability: Accounting for training and testing
Pattern Recognition
Resampling methods for meta-model validation with recommendations for evolutionary computation
Evolutionary Computation
Environmental Modelling & Software
A novel divide-and-merge classification for high dimensional datasets
Computational Biology and Chemistry
Parallel multitask cross validation for Support Vector Machine using GPU
Journal of Parallel and Distributed Computing
Knowledge-Based Systems
Towards minimizing the annotation cost of certified text classification
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Computer Methods and Programs in Biomedicine
Most machine learning researchers perform quantitative experiments to estimate generalization error and compare the performance of different algorithms (in particular, their proposed algorithm). To draw statistically convincing conclusions, it is important to estimate the uncertainty of such estimates. This paper studies the very commonly used K-fold cross-validation estimator of generalization performance. The main theorem shows that there exists no universal (valid under all distributions) unbiased estimator of the variance of K-fold cross-validation. The analysis that accompanies this result is based on the eigen-decomposition of the covariance matrix of errors, which has only three distinct eigenvalues, corresponding to three degrees of freedom of the matrix and three components of the total variance. This analysis helps to better understand the nature of the problem and how it can make naive estimators (which do not take into account the error correlations due to the overlap between training and test sets) grossly underestimate variance. This is confirmed by numerical experiments in which the three components of the variance are compared as the difficulty of the learning problem and the number of folds are varied.
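To make the abstract's point concrete, the following sketch (not code from the paper; the toy data, the nearest-centroid classifier, and all names are illustrative assumptions) computes a K-fold cross-validation estimate and the naive variance estimate that treats the K fold errors as independent. It is exactly this independence assumption that the paper shows is unwarranted, since fold errors are correlated.

```python
import random
import statistics

random.seed(0)

def nearest_centroid_error(train, test):
    # Toy 1-D nearest-centroid classifier: predict the class whose
    # training-set mean is closer to the test point.
    mean0 = statistics.mean(x for x, y in train if y == 0)
    mean1 = statistics.mean(x for x, y in train if y == 1)
    errors = sum(
        (0 if abs(x - mean0) <= abs(x - mean1) else 1) != y
        for x, y in test
    )
    return errors / len(test)

# Synthetic binary problem: class 0 ~ N(0,1), class 1 ~ N(1,1).
data = [(random.gauss(y, 1.0), y) for y in (0, 1) for _ in range(50)]
random.shuffle(data)

K = 5
folds = [data[i::K] for i in range(K)]
fold_errors = []
for k in range(K):
    test = folds[k]
    train = [p for j, f in enumerate(folds) if j != k for p in f]
    fold_errors.append(nearest_centroid_error(train, test))

cv_estimate = statistics.mean(fold_errors)

# Naive variance estimate: sample variance of the K fold errors divided
# by K, as if the folds were independent. Because the training sets of
# different folds overlap heavily, the fold errors are positively
# correlated, so this quantity tends to understate the true variance of
# the cross-validation estimate -- the effect the paper analyzes.
naive_var = statistics.variance(fold_errors) / K

print(cv_estimate, naive_var)
```

Repeating the whole procedure over many independently drawn datasets and comparing the empirical variance of `cv_estimate` against the average `naive_var` makes the underestimation visible; the paper's result says no rescaling of such within-run statistics can fix it for all distributions.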