Model selection in kernel ridge regression

Authors:
Peter Exterkate
Venue:
Computational Statistics & Data Analysis
Year:
2013

Abstract

Kernel ridge regression is a technique for performing ridge regression with a potentially infinite number of nonlinear transformations of the independent variables as regressors. This method is gaining popularity as a data-rich nonlinear forecasting tool that is applicable in many different contexts. The influence of the choice of kernel and the setting of tuning parameters on forecast accuracy is investigated. Several popular kernels are reviewed, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. The latter two kernels are interpreted in terms of their smoothing properties, and the tuning parameters associated with all of these kernels are related to smoothness measures of the prediction function and to the signal-to-noise ratio. Based on these interpretations, guidelines are provided for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study confirms the practical usefulness of these rules of thumb. Finally, the flexible and smooth functional forms provided by the Gaussian and Sinc kernels make them widely applicable. Their use is therefore recommended over the popular polynomial kernels in general settings where no information on the data-generating process is available.
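
As an illustration of the workflow the abstract describes, the following is a minimal sketch of kernel ridge regression with a Gaussian kernel, with both tuning parameters chosen from a small grid by cross-validation. It is not the paper's implementation. The kernel parametrization exp(-||x - z||^2 / (2 sigma^2)), the grid values, the random 5-fold split, the simulated data, and the function names `gaussian_kernel`, `krr_fit_predict`, and `cv_select` are all assumptions made for this example.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between rows of A and rows of B.

    Assumed parametrization: exp(-||a - b||^2 / (2 * sigma^2)).
    """
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def krr_fit_predict(X_train, y_train, X_test, sigma, lam):
    """Fit kernel ridge regression on the training set and predict at X_test."""
    K = gaussian_kernel(X_train, X_train, sigma)
    # Dual coefficients: solve (K + lam * I) alpha = y.
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train)
    return gaussian_kernel(X_test, X_train, sigma) @ alpha

def cv_select(X, y, sigmas, lams, n_folds=5, seed=0):
    """Return (cv_mse, sigma, lam) with the lowest cross-validated MSE."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    best = (np.inf, None, None)
    for sigma in sigmas:
        for lam in lams:
            sse = 0.0
            for test_idx in folds:
                train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
                pred = krr_fit_predict(X[train_idx], y[train_idx],
                                       X[test_idx], sigma, lam)
                sse += np.sum((pred - y[test_idx])**2)
            mse = sse / len(y)
            if mse < best[0]:
                best = (mse, sigma, lam)
    return best

# Simulated data (hypothetical): a smooth nonlinear signal plus noise.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(120, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(120)

# Small illustrative grids; these are placeholders, not the paper's values.
sigma_grid = [0.5, 1.0, 2.0]
lam_grid = [0.01, 0.1, 1.0]
best_mse, best_sigma, best_lam = cv_select(X, y, sigma_grid, lam_grid)
print(best_mse, best_sigma, best_lam)
```

The random fold split is used here only for brevity. In a forecasting setting with time-ordered observations, a split that respects the time ordering may be more appropriate.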