Ideally, pattern recognition machines provide constant output when the inputs are transformed under a group G of desired invariances. These invariances can be achieved by enhancing the training data to include examples of inputs transformed by elements of G, while leaving the corresponding targets unchanged. Alternatively, the cost function for training can include a regularization term that penalizes changes in the output when the input is transformed under the group. This paper relates the two approaches, showing precisely the sense in which the regularized cost function approximates the result of adding transformed examples to the training data. We introduce the notion of a probability distribution over the group transformations, and use this to rewrite the cost function for the enhanced training data. Under certain conditions, the new cost function is equivalent to the sum of the original cost function plus a regularizer. For unbiased models, the regularizer reduces to the intuitively obvious choice: a term that penalizes changes in the output when the inputs are transformed under the group. For infinitesimal transformations, the coefficient of the regularization term reduces to the variance of the distortions introduced into the training data. This correspondence provides a simple bridge between the two approaches.
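The equivalence described in the abstract can be illustrated numerically. Below is a minimal sketch (not from the paper) assuming a linear model f(x) = w·x and a transformation group of additive shifts x → x + ε with ε ~ N(0, σ²): averaging the squared-error cost over transformed inputs, with targets unchanged, yields the original cost plus a penalty whose coefficient is the variance σ² of the distortions. All variable names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data and a fixed linear model f(x) = w * x (hypothetical setup).
x = rng.normal(size=200)
y = 3.0 * x + rng.normal(scale=0.1, size=200)
w = 2.5
sigma = 0.3  # scale of the distortions drawn from the group

def sq_loss(w, x, y):
    """Original (unaugmented) mean squared-error cost."""
    return np.mean((w * x - y) ** 2)

# Cost with enhanced training data: average over many draws of the
# group element eps, leaving the targets unchanged.
eps = rng.normal(scale=sigma, size=(10000, x.size))
augmented = np.mean((w * (x + eps) - y) ** 2)

# Regularized cost: the original cost plus a term penalizing output
# change under the transformation. For this linear model that term is
# sigma^2 * w^2 -- the variance of the distortions times the squared
# sensitivity of the output to the input.
regularized = sq_loss(w, x, y) + sigma ** 2 * w ** 2

# The two costs agree up to Monte Carlo error.
print(abs(augmented - regularized))
```

For this model the algebra is immediate: E[(w(x+ε) − y)²] = (wx − y)² + w²σ², since the cross term vanishes for zero-mean ε. Nonlinear models admit the same expansion only for infinitesimal transformations, which is the regime the abstract's final claim addresses.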