Many domains of machine learning involve discovering dependencies and structure over time. In the most complex domains, long-term temporal dependencies are present. Neural network models such as LSTM have been developed to handle long-term dependencies, but the continuous nature of neural networks is not well suited to discrete symbol processing tasks. Further, the mathematical underpinnings of neural networks are unclear, and gradient-descent learning of recurrent neural networks seems particularly susceptible to local optima. We introduce a novel architecture for discovering dependencies in time. The architecture is formed by combining two variants of the hidden Markov model (HMM) - the factorial HMM and the input-output HMM - and adding a further strong constraint that requires the model to behave as a latch-and-store memory (the same constraint exploited in LSTM). This model, called an MIOFHMM, can learn structure that other HMM variants cannot, and generalizes better than LSTM to test sequences whose statistical properties (length, type of noise) differ from those of the training sequences. However, the MIOFHMM is slower to train and more susceptible to local optima than LSTM.
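The latch-and-store constraint described above can be illustrated with a minimal sketch. This is not the authors' MIOFHMM; it is a hypothetical toy input-output HMM in which the input-conditioned transition matrices either overwrite the hidden state or hold it unchanged, so the hidden state acts as a one-bit memory across arbitrarily long delays. The names (`TRANS`, `forward`, the input symbols `set0`/`set1`/`hold`) are assumptions for illustration only.

```python
import numpy as np

# Hypothetical sketch of a latch-and-store memory as an input-output HMM.
# Hidden states 0 and 1 represent the stored bit; the input symbol selects
# the transition matrix, so the input controls when the memory is written.
TRANS = {
    "set0": np.array([[1.0, 0.0],
                      [1.0, 0.0]]),  # write a 0, regardless of current state
    "set1": np.array([[0.0, 1.0],
                      [0.0, 1.0]]),  # write a 1, regardless of current state
    "hold": np.eye(2),               # latch: keep the stored bit unchanged
}

def forward(belief, inputs):
    """Propagate the hidden-state distribution through the
    input-conditioned transition matrices (forward recursion,
    with observations omitted for clarity)."""
    for u in inputs:
        belief = belief @ TRANS[u]
    return belief

start = np.array([0.5, 0.5])  # uninformative initial state distribution
# The stored bit survives 50 "hold" steps: the distribution stays at state 1.
print(forward(start, ["set1"] + ["hold"] * 50))
```

Because the "hold" matrix is the identity, the stored value persists deterministically over any delay; an unconstrained HMM would instead have to learn near-identity transitions, which tend to decay the state distribution over long sequences.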