We introduce a rigorous performance criterion for training algorithms for probabilistic automata (PAs) and hidden Markov models (HMMs), which are used extensively in speech recognition, and analyze the complexity of training as a computational problem. The PA training problem is that of approximating an arbitrary, unknown source distribution by distributions generated by a PA. We investigate the following question about this important, well-studied problem: Does there exist an efficient training algorithm such that the trained PAs provably converge to a model close to an optimal one with high confidence, from only a feasibly small amount of training data? We model this problem in the framework of computational learning theory and analyze both its sample complexity and its computational complexity. We show that the number of examples required for training PAs is moderate: up to logarithmic factors, it is linear in the number of transition probabilities to be trained, and a low-degree polynomial in the example length and in the parameters quantifying the accuracy and confidence. Computationally, however, training PAs is quite demanding: PAs with a fixed number of states are trainable in time polynomial in the accuracy and confidence parameters and in the example length, but not in the alphabet size unless RP = NP. The latter result is shown via a strong non-approximability result for the single-string maximum likelihood model problem for 2-state PAs, which is of independent interest.
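To make concrete the object at the heart of the hardness result, the following is a minimal sketch (not code from the paper) of evaluating the likelihood that a 2-state PA assigns to a single string. The parameterization, with per-symbol transition matrices and an explicit stopping-probability vector, is one common convention; the particular states, alphabet, and numbers below are hypothetical. The single-string maximum likelihood model problem asks for the PA parameters maximizing this quantity for a given string, and the abstract states that even approximating the optimum for 2-state PAs is hard as the alphabet grows (unless RP = NP).

```python
import numpy as np

def string_probability(pi, T, f, x):
    """Probability that a PA generates exactly the string x.

    pi : (n,) initial state distribution.
    T  : dict mapping each symbol a to an (n, n) matrix; T[a][i, j] is the
         probability of emitting a while moving from state i to state j.
    f  : (n,) stopping probability in each state, so that for every state i,
         sum over a and j of T[a][i, j], plus f[i], equals 1.
    x  : the observed string (a sequence of symbols).
    """
    alpha = pi.copy()          # forward vector: mass over states after a prefix
    for a in x:
        alpha = alpha @ T[a]   # one forward step per emitted symbol
    return float(alpha @ f)   # stop after the last symbol

# A hypothetical 2-state PA over the alphabet {"0", "1"}; each row of
# T["0"] + T["1"], together with f, is normalized to sum to 1.
pi = np.array([1.0, 0.0])
f = np.array([0.2, 0.1])
T = {
    "0": np.array([[0.4, 0.1],
                   [0.2, 0.1]]),
    "1": np.array([[0.2, 0.1],
                   [0.3, 0.3]]),
}

print(string_probability(pi, T, f, "0110"))
```

This forward computation is itself cheap (linear in the string length); the intractability established in the paper lies in searching over the parameters pi, T, and f for the likelihood-maximizing model.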