A Discrete Probabilistic Memory Model for Discovering Dependencies in Time

  • Authors:
  • Sepp Hochreiter; Michael Mozer

  • Venue:
  • ICANN '01 Proceedings of the International Conference on Artificial Neural Networks
  • Year:
  • 2001

Abstract

Many domains of machine learning involve discovering dependencies and structure over time. In the most complex of domains, long-term temporal dependencies are present. Neural network models such as LSTM have been developed to deal with long-term dependencies, but the continuous nature of neural networks is not well suited to discrete symbol processing tasks. Further, the mathematical underpinnings of neural networks are unclear, and gradient descent learning of recurrent neural networks seems particularly susceptible to local optima. We introduce a novel architecture for discovering dependencies in time. The architecture is formed by combining two variants of a hidden Markov model (HMM), the factorial HMM and the input-output HMM, and adding a further strong constraint that requires the model to behave as a latch-and-store memory (the same constraint exploited in LSTM). This model, called an MIOFHMM, can learn structure that other variants of the HMM cannot, and can generalize better than LSTM on test sequences that have different statistical properties (different lengths, different types of noise) than training sequences. However, the MIOFHMM is slower to train and is more susceptible to local optima than LSTM.
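To make the latch-and-store constraint concrete, the sketch below shows one way an input-conditioned HMM can be forced to behave as a memory: under uninformative inputs the transition matrix is the identity (the state is latched), while designated inputs may write or clear the memory. This is a minimal illustrative example, not the authors' MIOFHMM; the state and input names are hypothetical, and the full model additionally uses factorial (multiple independent) state variables.

```python
import numpy as np

# Two hypothetical memory states: 0 = "empty", 1 = "stored".
n_states = 2

# Latch constraint: under a "noise" input the state must persist
# (identity transitions), analogous to LSTM's constant error carousel.
A_noise = np.eye(n_states)

# A "store" input drives the memory into the stored state.
A_store = np.array([[0.0, 1.0],
                    [0.0, 1.0]])

# A "reset" input clears the memory.
A_reset = np.array([[1.0, 0.0],
                    [1.0, 0.0]])

transitions = {"noise": A_noise, "store": A_store, "reset": A_reset}

def filter_states(inputs, pi=np.array([1.0, 0.0])):
    """Propagate the hidden-state distribution through input-conditioned
    (row-stochastic) transition matrices, as in an input-output HMM."""
    belief = pi.copy()
    for x in inputs:
        belief = belief @ transitions[x]
        belief /= belief.sum()
    return belief

# The stored symbol survives arbitrarily many noise steps: prints [0. 1.]
print(filter_states(["noise", "store", "noise", "noise"]))
```

Because the transition structure under noise inputs is pinned to the identity, information written early in a sequence is preserved across long gaps, which is the property that lets the model capture long-term dependencies without relying on gradient flow through many time steps.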