Statistical language models are an important tool in natural language processing. They represent prior knowledge about a language, usually estimated from a set of samples called a corpus. In this paper, we present a novel way of creating N-gram language models using weighted finite automata. The construction of these models is formalised within the algebra underlying weighted finite automata and expressed in terms of weighted rational languages and transductions. In addition to the algebra, we use five special constant weighted transductions that depend only on the alphabet and the model parameter N. We also discuss efficient implementations of these transductions in terms of virtual constructions.
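
To make the target of the construction concrete, the following minimal sketch estimates an N-gram model from a corpus by relative frequency. It uses plain dictionaries rather than weighted finite automata, so it is not the construction presented in the paper; the function names `ngram_counts` and `ml_ngram_model`, the toy corpus, and the absence of smoothing or sentence-boundary symbols are all assumptions made for illustration.

```python
from collections import Counter


def ngram_counts(corpus, n):
    """Count every n-gram (as a tuple of symbols) occurring in the corpus."""
    counts = Counter()
    for sentence in corpus:
        # Slide a window of length n over the sentence.
        for i in range(len(sentence) - n + 1):
            counts[tuple(sentence[i:i + n])] += 1
    return counts


def ml_ngram_model(corpus, n):
    """Relative-frequency estimate of P(symbol | preceding n-1 symbols)."""
    counts = ngram_counts(corpus, n)
    # Total count of each history (the first n-1 symbols of an n-gram).
    history_totals = Counter()
    for gram, c in counts.items():
        history_totals[gram[:-1]] += c
    return {gram: c / history_totals[gram[:-1]] for gram, c in counts.items()}


# Toy corpus over the alphabet {a, b}; each sentence is a list of symbols.
corpus = [list("abab"), list("abb")]
model = ml_ngram_model(corpus, 2)
```

Here `model[("a", "b")]` is the estimated probability of `b` given the history `a`. A weighted-automaton formulation would aim to express the same counting and normalisation steps as operations on weighted languages and transductions.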