Model selection is of considerable importance for achieving a high level of generalization capability in supervised learning. In this article, we propose a new criterion for model selection, the subspace information criterion (SIC), which is a generalization of Mallows's C_L. It is assumed that the learning target function belongs to a specified functional Hilbert space, and the generalization error is defined as the Hilbert-space squared norm of the difference between the learned function and the target function. SIC gives an unbiased estimate of the generalization error so defined. SIC assumes the availability of an unbiased estimate of the target function and of the noise covariance matrix, both of which are generally unknown; a practical method for calculating SIC in least-mean-squares learning is therefore provided under the assumption that the dimension of the Hilbert space is less than the number of training examples. Finally, computer simulations on two examples show that SIC works well even when the number of training examples is small.
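
To make the criterion concrete, below is a minimal numerical sketch of SIC for a linear estimator, under the following assumptions: an orthonormal basis (so the Hilbert-space norm reduces to the Euclidean norm of the expansion coefficients), i.i.d. noise of known variance sigma2, and a design matrix Phi of full column rank with fewer basis functions than training examples, so that the ordinary least-squares solution can serve as the required unbiased estimate of the target. The ridge-regression candidate family, the polynomial basis, and the name sic_ridge are illustrative choices, not details taken from the paper.

import numpy as np

def sic_ridge(Phi, y, alpha, sigma2):
    """SIC for a ridge-regression estimator (illustrative sketch)."""
    n, p = Phi.shape
    # Learning matrix of the candidate model: the coefficient estimate is A @ y.
    A = np.linalg.solve(Phi.T @ Phi + alpha * np.eye(p), Phi.T)
    # Unbiased estimate of the target coefficients via ordinary least squares,
    # available because p <= n and Phi has full column rank.
    A_u = np.linalg.pinv(Phi)
    D = A - A_u
    r = D @ y
    # Unbiased estimate of the squared bias: ||D y||^2 overestimates it by
    # sigma2 * tr(D D^T) in expectation, so subtract that term.
    bias = r @ r - sigma2 * np.trace(D @ D.T)
    # Variance of the candidate estimator under i.i.d. noise.
    variance = sigma2 * np.trace(A @ A.T)
    return bias + variance

# Toy usage: choose the ridge parameter that minimizes SIC.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=30)
Phi = np.vander(x, 8, increasing=True)   # polynomial basis, p = 8 <= n = 30
sigma2 = 0.1 ** 2
y = np.sin(np.pi * x) + rng.normal(0.0, np.sqrt(sigma2), size=30)
alphas = np.logspace(-6, 2, 50)
best_alpha = min(alphas, key=lambda a: sic_ridge(Phi, y, a, sigma2))

The returned value is the sum of an unbiased estimate of the squared bias and the variance of the estimator; minimizing it over the candidate models (here, a grid of ridge parameters) carries out the model selection described above.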