It is well known that generalization capability is one of the most important criteria for developing and evaluating a classifier for a given pattern classification problem. The localized generalization error model (R*SM) recently proposed by Ng et al. [Localized generalization error and its application to RBFNN training, in: Proceedings of the International Conference on Machine Learning and Cybernetics, China, 2005; Image classification with the use of radial basis function neural networks and the minimization of the localized generalization error, Pattern Recognition 40(1) (2007) 4-18] provides a more intuitive view of the generalization error. Although R*SM offers a new way to improve generalization performance, it is in essence equivalent to a form of regularization. In this paper, we first prove the essential relationship between R*SM and regularization, demonstrating that the stochastic sensitivity measure in R*SM corresponds exactly to a regularizing term. Inspired by this relationship, we then develop a new generalization error bound from the regularization viewpoint. From this bound we derive a new regularization method, called locality regularization (LR). Unlike existing regularization methods, which append the regularizing term artificially and externally in order to smooth the solution, LR is deduced naturally and internally from the defined expected risk functional and is computed from locality information. By combining with spectral graph theory, LR introduces the local structure of the samples into the regularizing term and further improves generalization capability. In contrast with R*SM, which is relatively sensitive to how the samples are drawn, LR uses a discrete k-neighborhood rather than the continuous Q-neighborhood of R*SM, which automatically differentiates the relative positions of the training samples and avoids the complex computation of Q for different classifiers. Furthermore, LR uses the regularization parameter to control the trade-off between training accuracy and classifier stability. Experimental results on artificial and real-world problems show that LR yields better generalization capability than both R*SM and several traditional regularization methods.
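The abstract does not spell out LR's exact form, but its ingredients (a discrete k-neighborhood, spectral graph theory, and a regularizing term weighted by a trade-off parameter) suggest a graph-Laplacian smoothness penalty of the kind f^T L f = sum over neighboring pairs (f(x_i) - f(x_j))^2. The sketch below illustrates that general idea for a linear model f(x) = Xw with a ridge-style closed-form solution; the function names, the closed-form solver, and the parameters lam and k are illustrative assumptions, not the paper's actual algorithm.

import numpy as np

def build_knn_laplacian(X, k=5):
    """Unnormalized graph Laplacian L = D - W from a symmetrized k-NN graph.

    A hypothetical stand-in for the paper's k-neighborhood construction.
    """
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    n = X.shape[0]
    W = np.zeros((n, n))
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]             # k nearest neighbors, skipping self
    rows = np.repeat(np.arange(n), k)
    W[rows, idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                               # symmetrize the adjacency
    return np.diag(W.sum(axis=1)) - W                    # L = D - W

def fit_locality_regularized(X, y, lam=0.1, k=5):
    """Minimize ||Xw - y||^2 + lam * (Xw)^T L (Xw) in closed form.

    The Laplacian term equals the sum of (f_i - f_j)^2 over graph edges,
    so it penalizes output differences between k-NN neighbors; lam plays
    the role of the trade-off parameter between accuracy and stability.
    """
    L = build_knn_laplacian(X, k)
    A = X.T @ X + lam * X.T @ L @ X
    return np.linalg.solve(A, X.T @ y)

# Usage: two noisy 2-D classes with labels in {-1, +1}.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 0.5, (50, 2)), rng.normal(1, 0.5, (50, 2))])
y = np.concatenate([-np.ones(50), np.ones(50)])
w = fit_locality_regularized(X, y, lam=0.5, k=7)
print("training accuracy:", np.mean(np.sign(X @ w) == y))

Because the penalty depends only on which samples are neighbors, not on the radius of a continuous Q-neighborhood, this style of regularizer adapts automatically to the relative positions of the training samples, which is the property the abstract attributes to LR.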