Cross-validation for binary classification by real-valued functions: theoretical analysis
COLT '98 Proceedings of the eleventh annual conference on Computational learning theory
Beating the hold-out: bounds for K-fold and progressive cross-validation
COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Inference for the Generalization Error
Machine Learning
A study of cross-validation and bootstrap for accuracy estimation and model selection
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Accuracy estimation with clustered dataset
AusDM '06 Proceedings of the fifth Australasian conference on Data mining and analytics - Volume 61
Classifying carpets based on laser scanner data
Engineering Applications of Artificial Intelligence
Estimating the Confidence Interval for Prediction Errors of Support Vector Machine Classifiers
The Journal of Machine Learning Research
Predicting failures with developer networks and social network analysis
Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering
Semi-analytical method for analyzing models and model selection measures based on moment analysis
ACM Transactions on Knowledge Discovery from Data (TKDD)
A Unified Framework for MR Based Disease Classification
IPMI '09 Proceedings of the 21st International Conference on Information Processing in Medical Imaging
Short-term forecasting of emergency inpatient flow
IEEE Transactions on Information Technology in Biomedicine
A novel measure for evaluating classifiers
Expert Systems with Applications: An International Journal
A Survey of Accuracy Evaluation Metrics of Recommendation Tasks
The Journal of Machine Learning Research
Reliable prediction system based on support vector regression with genetic algorithms
ICNC'09 Proceedings of the 5th international conference on Natural computation
Artificial Intelligence in Medicine
Computational Statistics & Data Analysis
On Over-fitting in Model Selection and Subsequent Selection Bias in Performance Evaluation
The Journal of Machine Learning Research
Unsupervised Layer-Wise Model Selection in Deep Neural Networks
Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
Parameter screening and optimisation for ILP using designed experiments
ILP'09 Proceedings of the 19th international conference on Inductive logic programming
Computer Science - Research and Development
Parameter Screening and Optimisation for ILP using Designed Experiments
The Journal of Machine Learning Research
Building a qualitative recruitment system via SVM with MCDM approach
Applied Intelligence
Correcting bias in statistical tests for network classifier evaluation
ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III
COLT'05 Proceedings of the 18th annual conference on Learning Theory
Derivation of an artificial gene to improve classification accuracy upon gene selection
Computational Biology and Chemistry
Uncertainty estimation with a finite dataset in the assessment of classification models
Computational Statistics & Data Analysis
Classification of nuclear receptor subfamilies with RBF kernel in support vector machine
ISNN'05 Proceedings of the Second international conference on Advances in Neural Networks - Volume Part III
Classifier variability: Accounting for training and testing
Pattern Recognition
Resampling methods for meta-model validation with recommendations for evolutionary computation
Evolutionary Computation
Environmental Modelling & Software
A novel divide-and-merge classification for high dimensional datasets
Computational Biology and Chemistry
Parallel multitask cross validation for Support Vector Machine using GPU
Journal of Parallel and Distributed Computing
Knowledge-Based Systems
Towards minimizing the annotation cost of certified text classification
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Computer Methods and Programs in Biomedicine
Most machine learning researchers perform quantitative experiments to estimate generalization error and compare the performance of different algorithms (in particular, their proposed algorithm). To draw statistically convincing conclusions, it is important to estimate the uncertainty of such estimates. This paper studies the very commonly used K-fold cross-validation estimator of generalization performance. The main theorem shows that there exists no universal (valid under all distributions) unbiased estimator of the variance of K-fold cross-validation. The analysis that accompanies this result is based on the eigen-decomposition of the covariance matrix of errors, which has only three distinct eigenvalues, corresponding to three degrees of freedom of the matrix and three components of the total variance. This analysis helps to better understand the nature of the problem and how it can make naive estimators (which do not take into account the error correlations due to the overlap between training and test sets) grossly underestimate variance. This is confirmed by numerical experiments in which the three components of the variance are compared as the difficulty of the learning problem and the number of folds are varied.
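To make the abstract's point concrete, the following sketch (not code from the paper; the toy data, the nearest-centroid classifier, and all names are illustrative assumptions) computes a K-fold cross-validation estimate and the naive variance estimate that treats the K fold errors as independent. It is exactly this independence assumption that the paper shows is unwarranted, since fold errors are correlated.

```python
import random
import statistics

random.seed(0)

def nearest_centroid_error(train, test):
    # Toy 1-D nearest-centroid classifier: predict the class whose
    # training-set mean is closer to the test point.
    mean0 = statistics.mean(x for x, y in train if y == 0)
    mean1 = statistics.mean(x for x, y in train if y == 1)
    errors = sum(
        (0 if abs(x - mean0) <= abs(x - mean1) else 1) != y
        for x, y in test
    )
    return errors / len(test)

# Synthetic binary problem: class 0 ~ N(0,1), class 1 ~ N(1,1).
data = [(random.gauss(y, 1.0), y) for y in (0, 1) for _ in range(50)]
random.shuffle(data)

K = 5
folds = [data[i::K] for i in range(K)]
fold_errors = []
for k in range(K):
    test = folds[k]
    train = [p for j, f in enumerate(folds) if j != k for p in f]
    fold_errors.append(nearest_centroid_error(train, test))

cv_estimate = statistics.mean(fold_errors)

# Naive variance estimate: sample variance of the K fold errors divided
# by K, as if the folds were independent. Because the training sets of
# different folds overlap heavily, the fold errors are positively
# correlated, so this quantity tends to understate the true variance of
# the cross-validation estimate -- the effect the paper analyzes.
naive_var = statistics.variance(fold_errors) / K

print(cv_estimate, naive_var)
```

Repeating the whole procedure over many independently drawn datasets and comparing the empirical variance of `cv_estimate` against the average `naive_var` makes the underestimation visible; the paper's result says no rescaling of such within-run statistics can fix it for all distributions.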