From data distributions to regularization in invariant learning

  • Authors:
  • Todd K. Leen

  • Affiliations:
  • -

  • Venue:
  • Neural Computation
  • Year:
  • 1995

Abstract

Ideally, pattern recognition machines provide constant output when the inputs are transformed under a group G of desired invariances. These invariances can be achieved by enhancing the training data to include examples of inputs transformed by elements of G, while leaving the corresponding targets unchanged. Alternatively, the cost function for training can include a regularization term that penalizes changes in the output when the input is transformed under the group. This paper relates the two approaches, showing precisely the sense in which the regularized cost function approximates the result of adding transformed examples to the training data. We introduce the notion of a probability distribution over the group transformations, and use this to rewrite the cost function for the enhanced training data. Under certain conditions, the new cost function is equivalent to the sum of the original cost function plus a regularizer. For unbiased models, the regularizer reduces to the intuitively obvious choice: a term that penalizes changes in the output when the inputs are transformed under the group. For infinitesimal transformations, the coefficient of the regularization term reduces to the variance of the distortions introduced into the training data. This correspondence provides a simple bridge between the two approaches.
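The correspondence described in the abstract can be sketched compactly. The display below is a minimal illustration under assumed notation (the symbols f, s, E, Omega, sigma, and the squared-error loss are illustrative choices, not taken verbatim from the paper): the cost over group-augmented data splits, for small transformations, into the original cost plus a penalty on output changes under the transformation.

Let $s(x,\theta)$ denote the input $x$ transformed by the group element with parameter $\theta$, with $s(x,0) = x$, and let the transformation parameters be drawn from a density $p(\theta)$ with zero mean and variance $\sigma^2$. Training on the augmented data (transformed inputs, unchanged targets) corresponds to the cost

$$
\tilde{E} \;=\; \iint \big\| f\big(s(x,\theta)\big) - y \big\|^2 \, p(\theta)\, p(x,y)\; d\theta \, dx \, dy .
$$

Expanding $f(s(x,\theta))$ to second order in $\theta$ and averaging over $p(\theta)$ gives, for infinitesimal transformations,

$$
\tilde{E} \;\approx\; E \;+\; \sigma^2\, \Omega ,
\qquad
\Omega \;=\; \int \Big\| \frac{\partial f\big(s(x,\theta)\big)}{\partial \theta}\Big|_{\theta=0} \Big\|^2 \, p(x,y)\; dx \, dy \;+\; \text{(bias-dependent terms)} ,
$$

where $E$ is the cost on the unaugmented data. For unbiased models the bias-dependent terms vanish, leaving the intuitively obvious penalty on output changes under the group, with regularization coefficient equal to the variance $\sigma^2$ of the distortions introduced into the training data.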