Building intelligent systems that are capable of extracting high-level representations from high-dimensional sensory data lies at the core of solving many AI-related tasks, including object recognition, speech perception, and language understanding. Theoretical and biological arguments strongly suggest that building such systems requires models with deep architectures that involve many layers of nonlinear processing. The aim of the thesis is to demonstrate that deep generative models that contain many layers of latent variables and millions of parameters can be learned efficiently, and that the learned high-level feature representations can be successfully applied in a wide spectrum of application domains, including visual object recognition, information retrieval, and classification and regression tasks. In addition, similar methods can be used for nonlinear dimensionality reduction. The first part of the thesis focuses on the analysis and applications of probabilistic generative models called Deep Belief Networks. We show that these deep hierarchical models can learn useful feature representations from a large supply of unlabeled sensory inputs. The learned high-level representations capture much of the structure in the input data, which is useful for subsequent problem-specific tasks, such as classification, regression, or information retrieval, even though these tasks are unknown when the generative model is being trained. In the second part of the thesis, we introduce a new learning algorithm for a different type of hierarchical probabilistic model, which we call a Deep Boltzmann Machine. Like Deep Belief Networks, Deep Boltzmann Machines have the potential to learn internal representations that become increasingly complex at higher layers, which is a promising way of solving object and speech recognition problems.
Unlike in Deep Belief Networks and many existing models with deep architectures, the approximate inference procedure for Deep Boltzmann Machines can incorporate top-down feedback in addition to a fast bottom-up pass. This allows Deep Boltzmann Machines to better propagate uncertainty about ambiguous inputs.
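
To make the contrast between a single bottom-up pass and inference with top-down feedback concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the thesis's implementation: it assumes a Deep Boltzmann Machine with one visible layer and two hidden layers, binary units, no bias terms, and mean-field fixed-point updates as the approximate inference procedure. The names `bottom_up`, `mean_field`, `W1`, `W2`, and `n_iters` are hypothetical and chosen for this example only.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real-valued input to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bottom_up(v, W1, W2):
    """Single feed-forward pass: v -> h1 -> h2, with no top-down feedback.

    W1[i][j] connects visible unit i to first-layer hidden unit j.
    W2[j][k] connects first-layer hidden unit j to second-layer hidden unit k.
    Biases are omitted (an assumption made to keep the sketch short).
    """
    n_v, n_h1, n_h2 = len(v), len(W1[0]), len(W2[0])
    mu1 = [sigmoid(sum(v[i] * W1[i][j] for i in range(n_v)))
           for j in range(n_h1)]
    mu2 = [sigmoid(sum(mu1[j] * W2[j][k] for j in range(n_h1)))
           for k in range(n_h2)]
    return mu1, mu2


def mean_field(v, W1, W2, n_iters=10):
    """Refine the bottom-up guess with iterated mean-field updates.

    Each first-layer unit combines bottom-up input from the data v with
    top-down input from the current second-layer estimate mu2.
    """
    mu1, mu2 = bottom_up(v, W1, W2)
    n_v, n_h1, n_h2 = len(v), len(mu1), len(mu2)
    for _ in range(n_iters):
        mu1 = [
            sigmoid(
                sum(v[i] * W1[i][j] for i in range(n_v))        # bottom-up
                + sum(W2[j][k] * mu2[k] for k in range(n_h2))   # top-down
            )
            for j in range(n_h1)
        ]
        # The top layer has no layer above it, so it only receives input from h1.
        mu2 = [sigmoid(sum(mu1[j] * W2[j][k] for j in range(n_h1)))
               for k in range(n_h2)]
    return mu1, mu2


# Hypothetical toy model: 3 visible units, 2 + 2 hidden units.
v = [1.0, 0.0, 1.0]
W1 = [[0.5, -0.3], [0.2, 0.8], [-0.4, 0.1]]
W2 = [[0.7, 0.2], [0.3, 0.9]]
feedforward_mu1, _ = bottom_up(v, W1, W2)
refined_mu1, refined_mu2 = mean_field(v, W1, W2)
```

In this sketch, the only difference between the two estimates of the first hidden layer is the top-down term: if `W2` were all zeros, `mean_field` would return the same first-layer values as `bottom_up`.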