Efficient Pruning of Probabilistic Automata

Authors:
Franck Thollard; Baptiste Jeudy
Affiliations:
Laboratoire Hubert Curien UMR CNRS 5516, Université de Lyon, Université Jean-Monnet; Laboratoire Hubert Curien UMR CNRS 5516, Université de Lyon, Université Jean-Monnet
Venue:
SSPR & SPR '08 Proceedings of the 2008 Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition
Year:
2008

Abstract

Applications of probabilistic grammatical inference are limited by their time and space requirements. In statistical language modeling, for example, the large corpora now available lead to automata with millions of states. In this article we propose a method for pruning automata (restricted to tree-based structures) that is not only efficient (sub-quadratic) but also dramatically reduces the size of the automaton with only a small impact on the underlying distribution. Results are evaluated on a language modeling task.