A dynamic language model for speech recognition

  • Authors:
  • F. Jelinek; B. Merialdo; S. Roukos; M. Strauss

  • Venue:
  • HLT '91: Proceedings of the Workshop on Speech and Natural Language
  • Year:
  • 1991

Abstract

In the case of a trigram language model, the probability of the next word conditioned on the previous two words is estimated from a large corpus of text. The resulting static trigram language model (STLM) has fixed probabilities that are independent of the document being dictated. To improve the language model (LM), one can adapt the probabilities of the trigram language model to match the current document more closely. The partially dictated document provides significant clues about what words are more likely to be used next. Of many methods that can be used to adapt the LM, we describe in this paper a simple model based on the trigram frequencies estimated from the partially dictated document. We call this model a cache trigram language model (CTLM) since we are caching the recent history of words. We have found that the CTLM reduces the perplexity of a dictated document by 23%. The error rate of a 20,000-word isolated word recognizer decreases by about 5% at the beginning of a document and by about 24% after a few hundred words.
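
The abstract describes caching trigram frequencies from the partially dictated document and combining them with the static trigram model. A minimal sketch of that idea in Python is given below; it assumes a simple linear interpolation with an arbitrary cache weight, since the abstract does not state how the cache and static estimates are combined, and `static_model` stands in for whatever STLM is available.

```python
from collections import defaultdict

class CacheTrigramLM:
    """Illustrative cache trigram LM: interpolates a static trigram model
    with trigram frequencies accumulated from the words dictated so far.
    The interpolation weight `cache_weight` is an assumed free parameter,
    not a value taken from the paper."""

    def __init__(self, static_model, cache_weight=0.2):
        self.static_model = static_model        # callable: P_static(word | w1, w2)
        self.cache_weight = cache_weight
        self.trigram_counts = defaultdict(int)  # counts from the current document
        self.bigram_counts = defaultdict(int)   # context counts for normalization
        self.history = []

    def observe(self, word):
        """Add a newly dictated word to the document cache."""
        if len(self.history) >= 2:
            w1, w2 = self.history[-2], self.history[-1]
            self.trigram_counts[(w1, w2, word)] += 1
            self.bigram_counts[(w1, w2)] += 1
        self.history.append(word)

    def prob(self, word, w1, w2):
        """P(word | w1, w2) as a linear blend of cache and static estimates."""
        p_static = self.static_model(word, w1, w2)
        ctx = self.bigram_counts.get((w1, w2), 0)
        if ctx == 0:
            return p_static                     # no cache evidence for this context
        p_cache = self.trigram_counts[(w1, w2, word)] / ctx
        return self.cache_weight * p_cache + (1.0 - self.cache_weight) * p_static


# Example use with a placeholder uniform static model over a 20,000-word
# vocabulary (a stand-in for illustration only).
lm = CacheTrigramLM(static_model=lambda w, w1, w2: 1.0 / 20000)
for token in ["the", "cache", "model", "adapts", "the", "cache"]:
    lm.observe(token)
print(lm.prob("model", "the", "cache"))
```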