On the learnability of discrete distributions
STOC '94 Proceedings of the twenty-sixth annual ACM symposium on Theory of computing
On the learnability and usage of acyclic probabilistic finite automata
Journal of Computer and System Sciences - Special issue on the eighth annual workshop on computational learning theory, July 5–8, 1995
Stochastic Grammatical Inference with Multinomial Tests
ICGI '02 Proceedings of the 6th International Colloquium on Grammatical Inference: Algorithms and Applications
Probabilistic DFA Inference using Kullback-Leibler Divergence and Minimality
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Learning Stochastic Regular Grammars by Means of a State Merging Method
ICGI '94 Proceedings of the Second International Colloquium on Grammatical Inference and Applications
Identification of DFA: data-dependent vs data-independent algorithms
ICGI '96 Proceedings of the 3rd International Colloquium on Grammatical Inference: Learning Syntax from Sentences
PAC-learnability of Probabilistic Deterministic Finite State Automata
The Journal of Machine Learning Research
PAC-Learning of Markov models with hidden state
ECML'06 Proceedings of the 17th European conference on Machine Learning
PAC-learnability of probabilistic deterministic finite state automata in terms of variation distance
ALT'05 Proceedings of the 16th international conference on Algorithmic Learning Theory
Learnability of probabilistic automata via oracles
ALT'05 Proceedings of the 16th international conference on Algorithmic Learning Theory
Learning rational stochastic languages
COLT'06 Proceedings of the 19th annual conference on Learning Theory
Learning PDFA with asynchronous transitions
ICGI'10 Proceedings of the 10th international colloquium on Grammatical inference: theoretical results and applications
A lower bound for learning distributions generated by probabilistic automata
ALT'10 Proceedings of the 21st international conference on Algorithmic learning theory
Learning probabilistic automata: A study in state distinguishability
Theoretical Computer Science
Software model synthesis using satisfiability solvers
Empirical Software Engineering
We present an improvement of an algorithm due to Clark and Thollard (Journal of Machine Learning Research, 2004) for PAC-learning distributions generated by Probabilistic Deterministic Finite Automata (PDFA). Our algorithm aims to keep the rigorous guarantees of the original while using sample sizes that are not as astronomical as those predicted by the theory. We prove that our algorithm indeed PAC-learns in a stronger sense than the Clark-Thollard algorithm. We also report very preliminary experiments: on a few small targets (8-10 states), the algorithm requires only hundreds of examples to identify the target. We also test it on a web log file recording about a hundred thousand sessions from an e-commerce site, from which it extracts some nontrivial structure in the form of a PDFA with 30-50 states. An additional feature, which partly explains the reduction in sample size, is that our algorithm does not need any information about the distinguishability of the target as input.
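For readers unfamiliar with the object being learned, the following is a minimal sketch of how a PDFA defines a distribution over strings. It is not taken from the paper: the two-state automaton, its probabilities, the `STOP` marker, and the function names `string_probability` and `sample_string` are all hypothetical and chosen only for illustration. A learner of the kind described above would see only strings drawn as in `sample_string` and would have to recover the states and probabilities.

```python
import random

# Hypothetical end-of-string marker (an assumption of this sketch).
STOP = "$"

# Hypothetical 2-state PDFA over the alphabet {"a", "b"}.
# Each state maps a symbol to (probability, next_state).
# Determinism: a (state, symbol) pair leads to at most one next state.
PDFA = {
    0: {"a": (0.5, 1), "b": (0.25, 0), STOP: (0.25, None)},
    1: {"a": (0.25, 0), "b": (0.25, 1), STOP: (0.5, None)},
}

def string_probability(pdfa, s, start=0):
    """Probability that the PDFA generates exactly the string s."""
    state, prob = start, 1.0
    for sym in s:
        # The marker is not part of the alphabet; unknown symbols have mass 0.
        if sym == STOP or sym not in pdfa[state]:
            return 0.0
        p, state = pdfa[state][sym]
        prob *= p
    # The string must also end here, so multiply by the stopping probability.
    return prob * pdfa[state][STOP][0]

def sample_string(pdfa, rng, start=0):
    """Draw one string by walking the automaton until STOP is emitted."""
    state, out = start, []
    while True:
        symbols = list(pdfa[state])
        weights = [pdfa[state][x][0] for x in symbols]
        sym = rng.choices(symbols, weights=weights)[0]
        if sym == STOP:
            return "".join(out)
        out.append(sym)
        state = pdfa[state][sym][1]

if __name__ == "__main__":
    rng = random.Random(0)
    print([sample_string(PDFA, rng) for _ in range(5)])
    print(string_probability(PDFA, "ab"))
```

Because transitions are deterministic, the probability of a string is a single product along one path, which is what makes the state structure identifiable from samples in principle.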