Efficient Pruning of Probabilistic Automata

  • Authors:
  • Franck Thollard;Baptiste Jeudy

  • Affiliations:
  • Laboratoire Hubert Curien, UMR CNRS 5516, Université de Lyon, Université Jean-Monnet;Laboratoire Hubert Curien, UMR CNRS 5516, Université de Lyon, Université Jean-Monnet

  • Venue:
  • SSPR & SPR '08 Proceedings of the 2008 Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition
  • Year:
  • 2008

Abstract

Applications of probabilistic grammatical inference are limited by their time and space requirements. In statistical language modeling, for example, the large corpora now available lead to automata with millions of states. In this article we propose a method for pruning automata (restricted to tree-based structures) that is not only efficient (sub-quadratic) but also dramatically reduces the size of the automaton with only a small impact on the underlying distribution. The method is evaluated on a language modeling task.
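
To make the idea of pruning a tree-shaped automaton concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the abstract does not state the pruning criterion, so this example assumes a simple visit-count threshold, and the names `Node`, `build_prefix_tree`, `size`, `prune`, and `min_count` are all hypothetical. It also does not redistribute the probability mass of the removed states, which a real language model would have to handle (for example by smoothing).

```python
class Node:
    """One state of a prefix-tree automaton built from a corpus."""

    def __init__(self):
        self.count = 0      # training strings that pass through this state
        self.end = 0        # training strings that end in this state
        self.children = {}  # symbol -> child Node


def build_prefix_tree(strings):
    """Build a frequency prefix tree; relative counts give the probabilities."""
    root = Node()
    for s in strings:
        node = root
        node.count += 1
        for symbol in s:
            node = node.children.setdefault(symbol, Node())
            node.count += 1
        node.end += 1
    return root


def size(node):
    """Number of states in the subtree rooted at node."""
    return 1 + sum(size(child) for child in node.children.values())


def prune(node, min_count):
    """Cut every subtree whose root was visited fewer than min_count times.

    ASSUMPTION: a count threshold stands in for the paper's criterion.
    Returns the number of states removed.
    """
    removed = 0
    for symbol in list(node.children):
        child = node.children[symbol]
        if child.count < min_count:
            removed += size(child)
            del node.children[symbol]
        else:
            removed += prune(child, min_count)
    return removed


# Toy corpus: the tree has 5 states before pruning.
root = build_prefix_tree(["ab", "ab", "ac", "b"])
removed = prune(root, min_count=2)  # drops the rarely visited states
```

Because each state is visited once, this sketch runs in time linear in the size of the tree. A criterion that measures the impact on the distribution, as the abstract describes, would need more work per state.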