We discuss empirical comparison of analytical methods for model selection. Currently, there is no consensus on the best method for finite-sample estimation problems, even in the simple case of linear estimators. This article presents empirical comparisons between classical statistical methods, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), and the structural risk minimization (SRM) method, based on Vapnik-Chervonenkis (VC) theory, for regression problems. Our study is motivated by the empirical comparisons in Hastie, Tibshirani, and Friedman (2001), which claim that the SRM method performs poorly for model selection and suggest that AIC yields superior predictive performance. Hence, we present empirical comparisons for various data sets and different types of estimators (linear, subset selection, and k-nearest-neighbor regression). Our results demonstrate the practical advantages of VC-based model selection: it consistently outperforms AIC on all data sets. In our study, the SRM and BIC methods show similar predictive performance. This discrepancy (between empirical results obtained using the same data) is caused by methodological drawbacks in Hastie et al. (2001), especially their loose interpretation and application of the SRM method. Hence, we discuss methodological issues important for meaningful comparisons and practical application of the SRM method. We also point out the importance of accurately estimating model complexity (VC-dimension) for empirical comparisons, and propose a new practical estimate of model complexity for k-nearest-neighbor regression.