In this paper, we focus on two complementary approaches to significantly decrease the pre-training time of a deep belief network (DBN). First, we propose an adaptive step size technique that enhances the convergence of the contrastive divergence (CD) algorithm, reducing the number of epochs required to train the restricted Boltzmann machines (RBMs) that form the DBN infrastructure. Second, we present a highly scalable graphics processing unit (GPU) parallel implementation of the CD-k algorithm, which notably boosts training speed. Additionally, extensive experiments are conducted on the MNIST and HHreco databases. The results suggest that the maximum useful depth of a DBN is related to the number and quality of the training samples. Moreover, the lower-level layer was found to play a fundamental role in building successful DBN models. Furthermore, the results contradict the preconceived idea that all layers should be pre-trained. Finally, it is shown that by incorporating multiple back-propagation (MBP) layers, the generalization capability of DBNs is remarkably improved.
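To make the two ideas above concrete, the following is a minimal NumPy sketch of CD-1 training for a binary RBM with an adaptive step size. The update rule shown (grow the rate when reconstruction error falls, shrink it otherwise) is an illustrative heuristic, not the paper's specific technique; the function and parameter names (`cd1_step`, `train_rbm`, `lr`) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(W, b_v, b_h, v0, rng):
    """One CD-1 gradient estimate for a binary RBM."""
    # Positive phase: hidden activations driven by the data.
    h0_prob = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    # Negative phase: one Gibbs step back to the visible layer and up again.
    v1_prob = sigmoid(h0 @ W.T + b_v)
    h1_prob = sigmoid(v1_prob @ W + b_h)
    # CD-1 approximation to the log-likelihood gradient.
    dW = v0.T @ h0_prob - v1_prob.T @ h1_prob
    db_v = (v0 - v1_prob).sum(axis=0)
    db_h = (h0_prob - h1_prob).sum(axis=0)
    recon_err = np.mean((v0 - v1_prob) ** 2)
    return dW, db_v, db_h, recon_err

def train_rbm(data, n_hidden=16, epochs=20, lr=0.05, seed=0):
    """Train a binary RBM with CD-1 and an adaptive step size (heuristic)."""
    rng = np.random.default_rng(seed)
    n_samples, n_visible = data.shape
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v = np.zeros(n_visible)
    b_h = np.zeros(n_hidden)
    prev_err = np.inf
    for _ in range(epochs):
        dW, db_v, db_h, err = cd1_step(W, b_v, b_h, data, rng)
        W += lr * dW / n_samples
        b_v += lr * db_v / n_samples
        b_h += lr * db_h / n_samples
        # Illustrative adaptive step size only: not the scheme from the paper.
        lr = lr * 1.1 if err < prev_err else lr * 0.5
        prev_err = err
    return W, b_v, b_h, prev_err
```

In a DBN, this pre-training would be applied layer by layer: the hidden probabilities of one trained RBM become the visible data of the next.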