Artificial neural network reduction through oracle learning

  • Authors:
  • Joshua E. Menke, Tony R. Martinez

  • Affiliations:
  • Computer Science Department, Brigham Young University, Provo, UT, USA (corresponding author: Joshua E. Menke; Tel.: +1 801 422 3027; Fax: +1 801 422 0169; E-mail: josh@axon.cs.byu.edu)

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2009

Abstract

Often the best model to solve a real-world problem is relatively complex. This paper presents oracle learning, a method that uses a larger model as an oracle to train a smaller model on unlabeled data in order to obtain (1) a smaller acceptable model and (2) improved results over standard training methods on a similarly sized smaller model. In particular, this paper looks at oracle learning as applied to multi-layer perceptrons trained using standard backpropagation. Using multi-layer perceptrons for both the larger and smaller models, oracle learning obtains a 15.16% average decrease in error over direct training while retaining 99.64% of the initial oracle accuracy on automatic spoken digit recognition, with networks on average only 7% of the original size. For optical character recognition, oracle learning results in neural networks 6% of the original size that yield an 11.40% average decrease in error over direct training while maintaining 98.95% of the initial oracle accuracy. Analysis of the results suggests oracle learning is especially appropriate either when the final model must be relatively small or when little labeled data is available.
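The abstract describes a teacher-student setup: the large oracle network labels unlabeled inputs with its continuous outputs, and the small network is trained by standard backpropagation to match those outputs. The following is a minimal NumPy sketch of that idea, not the authors' exact implementation; the layer sizes, learning rate, and randomly generated "unlabeled" inputs are illustrative assumptions, and the untrained `oracle` network stands in for a network that would in practice be trained on labeled data first.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Randomly initialize (weights, biases) for each layer of an MLP."""
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Forward pass with sigmoid activations; returns all layer activations."""
    acts = [x]
    for W, b in params:
        x = 1.0 / (1.0 + np.exp(-(x @ W + b)))
        acts.append(x)
    return acts

def backprop_step(params, x, target, lr=0.5):
    """One gradient step on squared error between network output and target."""
    acts = forward(params, x)
    # Error signal at the sigmoid output layer for squared-error loss.
    delta = (acts[-1] - target) * acts[-1] * (1 - acts[-1])
    for i in reversed(range(len(params))):
        W, b = params[i]
        grad_W = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        # Propagate the error signal back through layer i.
        delta = (delta @ W.T) * acts[i] * (1 - acts[i])
        params[i] = (W - lr * grad_W, b - lr * grad_b)

# 1. A large "oracle" MLP; assumed already trained on the labeled data
#    (here it is merely randomly initialized, as a stand-in).
oracle = init_mlp([16, 64, 64, 4])

# 2. A much smaller student MLP, trained to reproduce the oracle's
#    continuous outputs on plentiful unlabeled data: the oracle's
#    outputs become the regression targets for backpropagation.
student = init_mlp([16, 4, 4])

unlabeled = rng.random((1000, 16))       # hypothetical unlabeled inputs
for epoch in range(20):
    for i in range(0, len(unlabeled), 32):
        batch = unlabeled[i:i + 32]
        targets = forward(oracle, batch)[-1]   # oracle labels the batch
        backprop_step(student, batch, targets)
```

Using the oracle's real-valued outputs rather than hard class labels is the key design choice: it lets the small network exploit arbitrary amounts of unlabeled data while inheriting the oracle's learned decision behavior.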