Two recently proposed learning algorithms, herding and fast persistent contrastive divergence (FPCD), share the following interesting characteristic: they exploit changes in the model parameters while sampling in order to escape modes and mix better during the sampling process that is part of the learning algorithm. We justify such approaches as ways to escape modes while keeping approximately the same asymptotic distribution of the Markov chain. In that spirit, we extend FPCD using an idea borrowed from herding in order to obtain a pure sampling algorithm, which we call the rates-FPCD sampler. Interestingly, this sampler can improve the model as we collect more samples, since it optimizes a lower bound on the log-likelihood of the training data. We provide empirical evidence that this new algorithm displays substantially better and more robust mixing than Gibbs sampling.
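
The following is a minimal, illustrative sketch (not the authors' implementation) of how a rates-FPCD-style sampler could perturb fast parameters while drawing samples from a binary RBM. It assumes W, b, c are the trained weights and biases (slow parameters), rates_W, rates_b, rates_c are the average sufficient statistics of the training data, and eta, decay, k are hypothetical hyperparameters; the fast parameters are pushed toward the training rates and away from the statistics of the samples just drawn, then decayed, so the chain escapes modes while staying close to the model's distribution.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_bernoulli(p):
    # Draw 0/1 samples with probabilities p.
    return (rng.random(p.shape) < p).astype(p.dtype)

def rates_fpcd_sample(W, b, c, rates_W, rates_b, rates_c,
                      n_samples=100, k=1, eta=0.01, decay=0.95):
    """Sample from an RBM while updating fast parameters (illustrative sketch).

    rates_* are the data means of the sufficient statistics
    (E_data[v h^T], E_data[v], E_data[h]), estimated once on the training set.
    """
    n_vis, n_hid = W.shape
    FW, Fb, Fc = np.zeros_like(W), np.zeros_like(b), np.zeros_like(c)
    v = sample_bernoulli(np.full(n_vis, 0.5))
    samples = []
    for _ in range(n_samples):
        # Gibbs transitions use slow plus fast parameters.
        for _ in range(k):
            h = sample_bernoulli(sigmoid(v @ (W + FW) + (c + Fc)))
            v = sample_bernoulli(sigmoid((W + FW) @ h + (b + Fb)))
        samples.append(v.copy())
        # Herding-like fast-parameter update: move toward the training rates,
        # away from the statistics of the sample just drawn, then decay back
        # toward zero so the asymptotic distribution remains roughly unchanged.
        FW = decay * FW + eta * (rates_W - np.outer(v, h))
        Fb = decay * Fb + eta * (rates_b - v)
        Fc = decay * Fc + eta * (rates_c - h)
    return np.array(samples)

In this sketch the rates would be computed once from the training set (for example, the empirical means of v, of the hidden activations given v, and of their outer product); the abstract does not specify these details, so the update rule and hyperparameters above are assumptions for illustration only.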