Restricted Boltzmann machines (RBMs) are often used as building blocks in greedy learning of deep networks. However, training this simple model can be laborious: traditional learning algorithms often converge only with the right choice of metaparameters that specify, for example, the learning-rate schedule and the scale of the initial weights. They are also sensitive to the specific data representation: an equivalent RBM can be obtained by flipping some bits and changing the weights and biases accordingly, yet traditional learning rules are not invariant to such transformations. Without careful tuning of these training settings, traditional algorithms can easily get stuck or even diverge. In this letter, we present an enhanced gradient that is derived to be invariant to bit-flipping transformations. We show experimentally that the enhanced gradient yields more stable training of RBMs with both a fixed and an adaptive learning rate.
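To make the bit-flipping equivalence concrete, the following is a minimal NumPy sketch (ours, not taken from the letter; the helper names free_energy and flip_visible are hypothetical) for a Bernoulli-Bernoulli RBM with energy E(v, h) = -v'Wh - b'v - c'h. Flipping a set S of visible units (v_i -> 1 - v_i) yields an equivalent RBM whose free energy differs from the original by a constant, so both parameterizations define the same distribution.

```python
import numpy as np

def free_energy(v, W, b, c):
    """F(v) = -b'v - sum_j log(1 + exp(c_j + v'W[:, j])) for a batch of rows v."""
    return -v @ b - np.logaddexp(0.0, c + v @ W).sum(axis=-1)

def flip_visible(W, b, c, flip):
    """Equivalent parameters after flipping the visible units indexed by `flip`:
    W[i, :] -> -W[i, :] and b[i] -> -b[i] for i in flip,
    c       -> c + sum_{i in flip} W[i, :]."""
    W2, b2, c2 = W.copy(), b.copy(), c.copy()
    c2 += W[flip].sum(axis=0)   # absorb the flipped rows into the hidden biases
    W2[flip] *= -1.0
    b2[flip] *= -1.0
    return W2, b2, c2

rng = np.random.default_rng(0)
n_vis, n_hid = 8, 5
W = rng.normal(0.0, 0.1, (n_vis, n_hid))
b = rng.normal(0.0, 0.1, n_vis)
c = rng.normal(0.0, 0.1, n_hid)
flip = np.array([0, 3, 4])

W2, b2, c2 = flip_visible(W, b, c, flip)
v = rng.integers(0, 2, (10, n_vis)).astype(float)
v2 = v.copy()
v2[:, flip] = 1.0 - v2[:, flip]

# The free energies differ by the constant sum_{i in flip} b[i] for every input,
# so the flipped RBM assigns the same probabilities to the flipped data.
diff = free_energy(v2, W2, b2, c2) - free_energy(v, W, b, c)
assert np.allclose(diff, b[flip].sum())
```

Traditional gradients, built from raw statistics such as <v_i h_j>, change under this reparameterization, which is why training depends on the data representation. The enhanced gradient is constructed so that its updates commute with such flips; in the authors' related ICML 2011 formulation (our reading, not quoted from this letter), the weight update is written in covariance form, Cov_data(v, h) - Cov_model(v, h), rather than in raw second moments.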