A central problem in learning is the selection of an appropriate model. This is typically done by estimating the unknown generalization errors of a set of candidate models and then choosing the model with the minimal generalization error estimate. In this article, we discuss the problem of model selection and generalization error estimation in the context of kernel regression models, e.g., kernel ridge regression, kernel subset regression, or Gaussian process regression. Previously, a non-asymptotic generalization error estimator called the subspace information criterion (SIC) was proposed, which could be successfully applied to finite-dimensional subspace models. SIC is an unbiased estimator of the generalization error in the finite-sample case under the conditions that the learning target function belongs to a specified reproducing kernel Hilbert space (RKHS) H and that the reproducing kernels centered on the training sample points span the whole space H. These conditions hold only if dim H ≤ l, where l is the number of training examples, so SIC was previously applicable only to finite-dimensional RKHSs. We show that even if the reproducing kernels centered on the training sample points do not span the whole space H, SIC is an unbiased estimator of an essential part of the generalization error. Our extension allows the use of any RKHS, including infinite-dimensional ones, i.e., the richer function classes commonly used in Gaussian processes, support vector machines, or boosting. We further show that when the kernel matrix is invertible, SIC can be expressed in a much simpler form, making its computation highly efficient. In computer simulations on ridge parameter selection with real and artificial data sets, SIC compares favorably with other standard model selection techniques such as leave-one-out cross-validation or an empirical Bayesian method.
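To make the experimental setting concrete, the sketch below illustrates ridge parameter selection for kernel ridge regression using one of the baselines named in the abstract, leave-one-out cross-validation with its exact closed form for linear-in-y estimators. It does not implement SIC itself; the Gaussian kernel, its width, the data, and the parameter grid are illustrative assumptions, not choices taken from the paper.

import numpy as np

def gaussian_kernel(X1, X2, width=1.0):
    # Gram matrix of a Gaussian RBF kernel (an assumed choice of RKHS kernel).
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def loo_error_krr(K, y, lam):
    # Exact leave-one-out squared error for kernel ridge regression,
    # using the hat matrix H = K (K + lam I)^{-1} and the shortcut
    # e_i = (y_i - yhat_i) / (1 - H_ii).
    n = len(y)
    H = K @ np.linalg.solve(K + lam * np.eye(n), np.eye(n))
    yhat = H @ y
    resid = (y - yhat) / (1.0 - np.diag(H))
    return np.mean(resid ** 2)

def select_ridge_parameter(X, y, lambdas, width=1.0):
    # Pick the ridge parameter with the smallest estimated generalization error.
    K = gaussian_kernel(X, X, width)
    scores = [loo_error_krr(K, y, lam) for lam in lambdas]
    return lambdas[int(np.argmin(scores))], scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(50, 1))
    y = np.sinc(X[:, 0]) + 0.1 * rng.standard_normal(50)
    lam, scores = select_ridge_parameter(X, y, lambdas=np.logspace(-4, 1, 20))
    print("selected ridge parameter:", lam)

Replacing loo_error_krr with an SIC-style criterion would keep the surrounding selection loop unchanged: any estimator of the generalization error can be plugged into the same argmin over the candidate ridge parameters.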