Training with noise is equivalent to Tikhonov regularization

  • Authors: Chris M. Bishop
  • Affiliations: -
  • Venue: Neural Computation
  • Year: 1995

Abstract

It is well known that the addition of noise to the input data of a neural network during training can, in some circumstances, lead to significant improvements in generalization performance. Previous work has shown that such training with noise is equivalent to a form of regularization in which an extra term is added to the error function. However, the regularization term, which involves second derivatives of the error function, is not bounded below, and so can lead to difficulties if used directly in a learning algorithm based on error minimization. In this paper we show that for the purposes of network training, the regularization term can be reduced to a positive semi-definite form that involves only first derivatives of the network mapping. For a sum-of-squares error function, the regularization term belongs to the class of generalized Tikhonov regularizers. Direct minimization of the regularized error function provides a practical alternative to training with noise.
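
The abstract's closing claim can be illustrated with a short sketch. The JAX snippet below is not from the paper; the network architecture, the function names `net` and `regularized_loss`, and the parameter `sigma2` are illustrative assumptions. It minimizes a sum-of-squares error augmented by a first-derivative penalty of the kind described above: the squared Jacobian of the network mapping with respect to its inputs, weighted by the assumed input-noise variance.

```python
# A minimal sketch, assuming JAX: direct minimization of a sum-of-squares
# error plus a first-derivative (Tikhonov-type) penalty, as an alternative
# to injecting input noise during training.
import jax
import jax.numpy as jnp

def net(params, x):
    """Small feed-forward mapping; the architecture is illustrative only."""
    (W1, b1), (W2, b2) = params
    h = jnp.tanh(W1 @ x + b1)
    return W2 @ h + b2

def regularized_loss(params, xs, ts, sigma2):
    """Sum-of-squares error plus sigma2 times the squared Jacobian norm.

    Here sigma2 plays the role of the input-noise variance; the penalty is
    positive semi-definite because it is a sum of squared first derivatives.
    """
    def per_example(x, t):
        y = net(params, x)
        sse = 0.5 * jnp.sum((y - t) ** 2)
        # Jacobian of the network outputs with respect to the inputs
        J = jax.jacobian(lambda u: net(params, u))(x)
        omega = 0.5 * jnp.sum(J ** 2)
        return sse + sigma2 * omega
    return jnp.sum(jax.vmap(per_example)(xs, ts))

# Gradients of the regularized error can then be taken as usual, e.g.
# grads = jax.grad(regularized_loss)(params, xs, ts, 0.01)
```

In this reading, the only change relative to ordinary sum-of-squares training is the extra Jacobian term, so the regularized error can be minimized with any standard gradient-based optimizer instead of adding noise to the inputs.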