Direct transfer of learned information among neural networks

  • Authors:
  • Lorien Y. Pratt, Computer Science Department, Rutgers University, New Brunswick, NJ
  • Jack Mostow, Computer Science Department, Rutgers University, New Brunswick, NJ
  • Candace A. Kamm, Speech Technology Research, Morristown, NJ

  • Venue:
  • AAAI'91: Proceedings of the Ninth National Conference on Artificial Intelligence - Volume 2
  • Year:
  • 1991

Abstract

A touted advantage of symbolic representations is the ease of transferring learned information from one intelligent agent to another. This paper investigates an analogous problem: how to use information from one neural network to help a second network learn a related task. Rather than translate such information into symbolic form (in which it may not be readily expressible), we investigate the direct transfer of information encoded as weights. Here, we focus on how transfer can be used to address the important problem of improving neural network learning speed. First we present an exploratory study of the somewhat surprising effects of pre-setting network weights on subsequent learning. Guided by hypotheses from this study, we sped up back-propagation learning for two speech recognition tasks. By transferring weights from smaller networks trained on subtasks, we achieved speedups of up to an order of magnitude compared with training from random initial weights, even accounting for the time required to train the smaller networks. We include results on how transfer scales to a large phoneme recognition problem.
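
The core idea sketched in the abstract, pre-setting a larger network's weights from a smaller network trained on a subtask and then continuing back-propagation from that starting point, can be illustrated roughly as follows. This is a minimal sketch, not the paper's exact procedure: the layer sizes, the modern PyTorch framing, and the slicing scheme are illustrative assumptions only.

```python
# Illustrative sketch of direct weight transfer between networks.
# Sizes and framework are hypothetical; the paper predates PyTorch.
import torch
import torch.nn as nn

# Small network assumed to have been trained on a subtask
# (e.g., a subset of the phoneme classes).
small = nn.Sequential(nn.Linear(16, 8), nn.Sigmoid(), nn.Linear(8, 4))
# ... training of `small` with back-propagation omitted ...

# Larger network for the full task: wider hidden layer, more output units.
large = nn.Sequential(nn.Linear(16, 20), nn.Sigmoid(), nn.Linear(20, 10))

with torch.no_grad():
    # Copy the small network's hidden-layer weights into the first 8 hidden
    # units of the large network; the other 12 units keep their random init.
    large[0].weight[:8, :] = small[0].weight
    large[0].bias[:8] = small[0].bias
    # Copy the small network's output weights into the matching slice of the
    # large output layer (first 4 output units, first 8 hidden units).
    large[2].weight[:4, :8] = small[2].weight
    large[2].bias[:4] = small[2].bias

# Back-propagation training of `large` on the full task then proceeds from
# this pre-set initialization rather than from purely random weights.
```

The intended effect, as the abstract reports, is that starting from such transferred weights reaches a given error level in substantially fewer epochs than starting from random weights.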