We describe two attempts to improve our stochastic language models. In the first, we identify a systematic overestimation in the traditional backoff model and use statistical reasoning to correct it. Our modification yields up to a 6% reduction in perplexity across various tasks. Although the improvement is modest, it is achieved with hardly any increase in the complexity of the model. Both analysis and empirical data suggest that the modification is most suitable when training data is sparse.

In the second attempt, we propose a new type of adaptive language model. Existing adaptive models use a dynamic cache based on the history of the document seen up to that point. But another source of information in the history, within-document word-sequence correlations, has not yet been tapped. We describe a model that attempts to capture this information, using a framework in which one word sequence triggers another, raising its estimated probability. We discuss various issues in the design of such a model and describe our first attempt at building one. Our preliminary results include a perplexity reduction of 10% to 32%, depending on the test set.
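The first attempt modifies the traditional backoff estimator. The abstract does not spell out the correction, but the kind of backoff bigram being corrected can be sketched as follows; this uses absolute discounting, and the `discount` value and helper names are illustrative, not taken from the paper:

```python
from collections import Counter

def train(tokens):
    """Collect unigram and bigram counts from a token stream."""
    return Counter(tokens), Counter(zip(tokens, tokens[1:]))

def backoff_prob(prev, w, unigrams, bigrams, discount=0.5):
    """Absolute-discounted backoff bigram estimate P(w | prev).

    Seen bigrams are discounted; the freed mass is redistributed over
    the unseen continuations in proportion to their unigram counts.
    (A sketch: assumes `prev` was observed and has unseen continuations.)
    """
    if (prev, w) in bigrams:
        return (bigrams[(prev, w)] - discount) / unigrams[prev]
    # mass freed by discounting the bigrams observed after `prev`
    seen = [v for (u, v) in bigrams if u == prev]
    alpha = discount * len(seen) / unigrams[prev]
    # normalize the unigram distribution over unseen continuations only
    unseen_mass = sum(c for v, c in unigrams.items()
                      if (prev, v) not in bigrams)
    return alpha * unigrams[w] / unseen_mass
```

On a toy corpus the estimates for a fixed history sum to one, which is the sanity check any such backoff scheme must pass.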
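The trigger mechanism of the second attempt can be illustrated with a toy interpolation scheme: when a word from a trigger pair appears in the document history, the estimated probabilities of its triggered words are raised. The trigger-pair table, the interpolation weight `lam`, and the uniform distribution over triggered words below are all simplifying assumptions for illustration; the paper's actual model is more elaborate:

```python
def adapted_prob(word, history, static_prob, triggers, lam=0.1):
    """Interpolate a static estimate with a trigger-based component.

    static_prob: P(word) from the static model.
    triggers: hypothetical table mapping a history word to the set of
              words whose probability it should raise.
    """
    triggered = set()
    for h in history:
        triggered.update(triggers.get(h, ()))
    if not triggered:
        return static_prob  # nothing in the history fired a trigger
    # uniform distribution over the triggered words (a simplification)
    trig_p = 1.0 / len(triggered) if word in triggered else 0.0
    return (1.0 - lam) * static_prob + lam * trig_p
```

Because the trigger component is itself a distribution, the interpolated estimates still sum to one while triggered words get a boost over their static probability.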