Artificial neural network reduction through oracle learning

  • Authors:
  • Joshua E. Menke, Tony R. Martinez

  • Affiliations:
  • Computer Science Department, Brigham Young University, Provo, UT, USA (corresponding author: Joshua E. Menke; Tel.: +1 801 422 3027; Fax: +1 801 422 0169; E-mail: josh@axon.cs.byu.edu)

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2009

Abstract

Often the best model to solve a real-world problem is relatively complex. This paper presents oracle learning, a method that uses a larger model as an oracle to train a smaller model on unlabeled data in order to obtain (1) a smaller acceptable model and (2) improved results over standard training methods on a similarly sized smaller model. In particular, this paper looks at oracle learning as applied to multi-layer perceptrons trained using standard backpropagation. Using multi-layer perceptrons for both the larger and smaller models, oracle learning obtains a 15.16% average decrease in error over direct training while retaining 99.64% of the initial oracle accuracy on automatic spoken digit recognition, with networks on average only 7% of the original size. For optical character recognition, oracle learning results in neural networks 6% of the original size that yield an 11.40% average decrease in error over direct training while maintaining 98.95% of the initial oracle accuracy. Analysis of the results suggests oracle learning is especially appropriate either when the final model must be relatively small or when little labeled data is available.
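The abstract describes a teacher-student setup: the large oracle network labels unlabeled inputs with its continuous outputs, and the small network is trained by standard backpropagation to match those outputs. The following is a minimal NumPy sketch of that idea, not the authors' exact implementation; the layer sizes, learning rate, and randomly generated "unlabeled" inputs are illustrative assumptions, and the untrained `oracle` network stands in for a network that would in practice be trained on labeled data first.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Randomly initialize (weights, biases) for each layer of an MLP."""
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Forward pass with sigmoid activations; returns all layer activations."""
    acts = [x]
    for W, b in params:
        x = 1.0 / (1.0 + np.exp(-(x @ W + b)))
        acts.append(x)
    return acts

def backprop_step(params, x, target, lr=0.5):
    """One gradient step on squared error between network output and target."""
    acts = forward(params, x)
    # Error signal at the sigmoid output layer for squared-error loss.
    delta = (acts[-1] - target) * acts[-1] * (1 - acts[-1])
    for i in reversed(range(len(params))):
        W, b = params[i]
        grad_W = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        # Propagate the error signal back through layer i.
        delta = (delta @ W.T) * acts[i] * (1 - acts[i])
        params[i] = (W - lr * grad_W, b - lr * grad_b)

# 1. A large "oracle" MLP; assumed already trained on the labeled data
#    (here it is merely randomly initialized, as a stand-in).
oracle = init_mlp([16, 64, 64, 4])

# 2. A much smaller student MLP, trained to reproduce the oracle's
#    continuous outputs on plentiful unlabeled data: the oracle's
#    outputs become the regression targets for backpropagation.
student = init_mlp([16, 4, 4])

unlabeled = rng.random((1000, 16))       # hypothetical unlabeled inputs
for epoch in range(20):
    for i in range(0, len(unlabeled), 32):
        batch = unlabeled[i:i + 32]
        targets = forward(oracle, batch)[-1]   # oracle labels the batch
        backprop_step(student, batch, targets)
```

Using the oracle's real-valued outputs rather than hard class labels is the key design choice: it lets the small network exploit arbitrary amounts of unlabeled data while inheriting the oracle's learned decision behavior.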