Hidden Markov Models (HMMs) are one of the most fundamental and widely used statistical tools for modeling discrete time series. In general, learning HMMs from data is computationally hard (under cryptographic assumptions), and practitioners typically resort to search heuristics that suffer from the usual local-optima issues. We prove that under a natural separation condition (bounds on the smallest singular value of the HMM parameters), there is an efficient and provably correct algorithm for learning HMMs. The sample complexity of the algorithm does not explicitly depend on the number of distinct (discrete) observations; it depends on this quantity only implicitly, through spectral properties of the underlying HMM. This makes the algorithm particularly applicable to settings with a large number of observations, such as natural language processing, where the observation space is sometimes the set of words in a language. The algorithm is also simple, employing only a singular value decomposition and matrix multiplications.
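To make the "SVD plus matrix multiplications" claim concrete, below is a minimal numpy sketch in the spirit of the algorithm the abstract describes, following the standard observable-operator construction from the spectral-learning literature. The function names, the empirical-moment estimation from observation triples, and all implementation details here are illustrative assumptions, not taken verbatim from the paper.

```python
# Hedged sketch of spectral HMM learning: estimate low-order moments of
# consecutive observations, take one SVD, and form "observable operators"
# via pseudoinverses and matrix products. Assumed, not the paper's exact code.
import numpy as np

def learn_spectral_hmm(triples, n_obs, n_states):
    """Estimate observable operators from samples of triples (x1, x2, x3)
    of consecutive observations, each an integer in [0, n_obs)."""
    P1 = np.zeros(n_obs)                    # [P1]_i    = Pr[x1 = i]
    P21 = np.zeros((n_obs, n_obs))          # [P21]_ij  = Pr[x2 = i, x1 = j]
    P3x1 = np.zeros((n_obs, n_obs, n_obs))  # [P3x1[x]]_ij = Pr[x3 = i, x2 = x, x1 = j]
    for x1, x2, x3 in triples:
        P1[x1] += 1
        P21[x2, x1] += 1
        P3x1[x2, x3, x1] += 1
    P1 /= len(triples)
    P21 /= len(triples)
    P3x1 /= len(triples)

    # The single SVD: U holds the top-m left singular vectors of the
    # bigram matrix P21, where m = n_states.
    U, _, _ = np.linalg.svd(P21)
    U = U[:, :n_states]

    UP21_pinv = np.linalg.pinv(U.T @ P21)
    b1 = U.T @ P1                           # initial "state" vector
    binf = np.linalg.pinv(P21.T @ U) @ P1   # normalization vector
    # One m-by-m operator per symbol: B_x = (U^T P_{3,x,1}) (U^T P_{2,1})^+
    B = np.stack([U.T @ P3x1[x] @ UP21_pinv for x in range(n_obs)])
    return b1, binf, B

def sequence_prob(b1, binf, B, seq):
    """Estimated joint probability of an observation sequence:
    Pr[x_1, ..., x_t] ~= binf^T B_{x_t} ... B_{x_1} b1."""
    b = b1
    for x in seq:
        b = B[x] @ b
    return float(binf @ b)
```

Note how the sketch mirrors the abstract's two key points: the only nontrivial numerical steps are one SVD and matrix multiplications (plus pseudoinverses), and after the moment tables are formed, all learned quantities live in an m-dimensional space determined by the number of hidden states rather than by the size of the observation alphabet.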