Two known approaches to complexity selection are considered: n-fold cross-validation and structural risk minimization. In either approach, the indicated optimal complexity (the minimum of a generalization error estimate or of a bound) may differ from the genuine minimum of the unknown true risk. In this paper, the problem is posed in a novel quantitative way. We state and prove theorems demonstrating how one can calculate pessimistic probabilities of a discrepancy between these minima under given experimental conditions. The probabilities are expressed in terms of all relevant constants: the sample size, the number of cross-validation folds, the capacity of the set of approximating functions, and bounds on this set. We report experiments carried out to validate the results.
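To make the cross-validation side of complexity selection concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's procedure: the piecewise-constant (histogram) regressor on [0, 1], the squared-error loss, and all function names (`fit_histogram`, `mse`, `cv_error`, `select_complexity`) are hypothetical stand-ins, with the number of bins playing the role of complexity.

```python
import math
import random


def fit_histogram(xs, ys, bins):
    """Fit a piecewise-constant regressor with `bins` equal-width cells on [0, 1].

    Hypothetical stand-in for a family of approximating functions whose
    complexity grows with `bins`.
    """
    sums = [0.0] * bins
    counts = [0] * bins
    for x, y in zip(xs, ys):
        cell = min(int(x * bins), bins - 1)
        sums[cell] += y
        counts[cell] += 1
    overall = sum(ys) / len(ys)
    # Cells with no training points fall back to the overall training mean.
    return [s / c if c else overall for s, c in zip(sums, counts)]


def mse(model, xs, ys):
    """Mean squared error of a fitted histogram model on (xs, ys)."""
    bins = len(model)
    errors = ((model[min(int(x * bins), bins - 1)] - y) ** 2 for x, y in zip(xs, ys))
    return sum(errors) / len(xs)


def cv_error(xs, ys, bins, n_folds):
    """n-fold cross-validation estimate of the generalization error."""
    total = 0.0
    for k in range(n_folds):
        # Fold k holds out every n_folds-th point, starting at index k.
        held_out = set(range(k, len(xs), n_folds))
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        test_x = [xs[i] for i in sorted(held_out)]
        test_y = [ys[i] for i in sorted(held_out)]
        model = fit_histogram(train_x, train_y, bins)
        total += mse(model, test_x, test_y) * len(test_x)
    return total / len(xs)


def select_complexity(xs, ys, candidates, n_folds):
    """Return the candidate complexity with the smallest cross-validation error."""
    return min(candidates, key=lambda b: cv_error(xs, ys, b, n_folds))


if __name__ == "__main__":
    random.seed(0)
    # Assumed toy data: a noisy sine curve sampled on [0, 1).
    xs = [random.random() for _ in range(60)]
    ys = [math.sin(2 * math.pi * x) + random.gauss(0.0, 0.3) for x in xs]
    print("indicated complexity:", select_complexity(xs, ys, range(1, 16), n_folds=5))
```

The value returned by `select_complexity` is only the indicated optimum. Evaluating each candidate on a large fresh sample from the same distribution would approximate the true risks, and the two minimizers need not coincide, which is the discrepancy the abstract refers to.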