Applications of probabilistic grammatical inference are limited by their time and space requirements. In statistical language modeling, for example, large corpora are now available, and they lead to automata with millions of states. In this article we propose a method for pruning automata (restricted to tree-based structures) that is not only efficient (sub-quadratic) but also dramatically reduces the size of the automaton with only a small impact on the underlying distribution. Results are evaluated on a language modeling task.