Direct transfer of learned information among neural networks

  • Authors:
  • Lorien Y. Pratt, Computer Science Department, Rutgers University, New Brunswick, NJ
  • Jack Mostow, Computer Science Department, Rutgers University, New Brunswick, NJ
  • Candace A. Kamm, Speech Technology Research, Morristown, NJ

  • Venue:
  • AAAI'91: Proceedings of the Ninth National Conference on Artificial Intelligence - Volume 2
  • Year:
  • 1991

Abstract

A touted advantage of symbolic representations is the ease of transferring learned information from one intelligent agent to another. This paper investigates an analogous problem: how to use information from one neural network to help a second network learn a related task. Rather than translate such information into symbolic form (in which it may not be readily expressible), we investigate the direct transfer of information encoded as weights. Here, we focus on how transfer can be used to address the important problem of improving neural network learning speed. First we present an exploratory study of the somewhat surprising effects of pre-setting network weights on subsequent learning. Guided by hypotheses from this study, we sped up back-propagation learning for two speech recognition tasks. By transferring weights from smaller networks trained on subtasks, we achieved speedups of up to an order of magnitude compared with training from random initial weights, even accounting for the time required to train the smaller networks. We include results on how transfer scales to a large phoneme recognition problem.
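
The core idea sketched in the abstract, pre-setting a larger network's weights from a smaller network trained on a subtask and then continuing back-propagation from that starting point, can be illustrated roughly as follows. This is a minimal sketch, not the paper's exact procedure: the layer sizes, the modern PyTorch framing, and the slicing scheme are illustrative assumptions only.

```python
# Illustrative sketch of direct weight transfer between networks.
# Sizes and framework are hypothetical; the paper predates PyTorch.
import torch
import torch.nn as nn

# Small network assumed to have been trained on a subtask
# (e.g., a subset of the phoneme classes).
small = nn.Sequential(nn.Linear(16, 8), nn.Sigmoid(), nn.Linear(8, 4))
# ... training of `small` with back-propagation omitted ...

# Larger network for the full task: wider hidden layer, more output units.
large = nn.Sequential(nn.Linear(16, 20), nn.Sigmoid(), nn.Linear(20, 10))

with torch.no_grad():
    # Copy the small network's hidden-layer weights into the first 8 hidden
    # units of the large network; the other 12 units keep their random init.
    large[0].weight[:8, :] = small[0].weight
    large[0].bias[:8] = small[0].bias
    # Copy the small network's output weights into the matching slice of the
    # large output layer (first 4 output units, first 8 hidden units).
    large[2].weight[:4, :8] = small[2].weight
    large[2].bias[:4] = small[2].bias

# Back-propagation training of `large` on the full task then proceeds from
# this pre-set initialization rather than from purely random weights.
```

The intended effect, as the abstract reports, is that starting from such transferred weights reaches a given error level in substantially fewer epochs than starting from random weights.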