Recent text and speech processing applications such as speech mining raise new and more general problems related to the construction of language models. We present and describe in detail several new and efficient algorithms to address these more general problems and report experimental results demonstrating their usefulness. We give an algorithm for computing efficiently the expected counts of any sequence in a word lattice output by a speech recognizer or any arbitrary weighted automaton; describe a new technique for creating exact representations of n-gram language models by weighted automata whose size is practical for offline use even for a vocabulary size of about 500,000 words and an n-gram order n = 6; and present a simple and more general technique for constructing class-based language models that allows each class to represent an arbitrary weighted automaton. An efficient implementation of our algorithms and techniques has been incorporated in a general software library for language modeling, the GRM Library, that includes many other text and grammar processing functionalities.
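To make the notion of expected counts concrete, the following is a minimal sketch that computes them by brute force on a tiny acyclic lattice. It illustrates only the definition (the count of a sequence on each path, weighted by that path's normalized probability); it is not the efficient algorithm the abstract refers to, and it does not use the GRM Library. The function name `expected_counts`, the dictionary-based lattice encoding, and the toy probabilities are all assumptions made for illustration.

```python
from collections import defaultdict


def expected_counts(arcs, start, finals, order):
    """Expected count of every word sequence of length `order`.

    Hypothetical encoding (not from the source):
      arcs   : dict mapping state -> list of (word, prob, next_state)
      start  : initial state
      finals : set of final states
    The lattice must be acyclic, since all paths are enumerated.
    """
    # Enumerate every accepting path with its probability (naive, exponential).
    paths = []
    stack = [(start, (), 1.0)]
    while stack:
        state, words, prob = stack.pop()
        if state in finals:
            paths.append((words, prob))
        for word, p, nxt in arcs.get(state, []):
            stack.append((nxt, words + (word,), prob * p))

    # Normalize so that path weights form a probability distribution.
    total = sum(p for _, p in paths)

    # Sum, over paths, the number of occurrences times the path probability.
    counts = defaultdict(float)
    for words, p in paths:
        for i in range(len(words) - order + 1):
            counts[words[i:i + order]] += p / total
    return dict(counts)


# Toy lattice: two competing first words, then a shared second word.
lattice = {
    0: [("a", 0.6, 1), ("b", 0.4, 1)],
    1: [("a", 1.0, 2)],
}
unigrams = expected_counts(lattice, start=0, finals={2}, order=1)
bigrams = expected_counts(lattice, start=0, finals={2}, order=2)
# unigrams[("a",)] is about 1.6: "a" occurs twice on the 0.6 path, once on the 0.4 path.
# bigrams[("b", "a")] is about 0.4.
```

Path enumeration grows exponentially with lattice size, which is why an efficient algorithm of the kind the abstract describes is needed for real recognizer output.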