A hybrid approach to adaptive statistical language modeling

Authors: Ronald Rosenfeld
Affiliations: Carnegie Mellon University, Pittsburgh, PA
Venue: HLT '94 Proceedings of the workshop on Human Language Technology
Year: 1994

Abstract

We describe our latest attempt at adaptive language modeling. At the heart of our approach is a Maximum Entropy (ME) model, which incorporates many knowledge sources in a consistent manner. The other components are a selective unigram cache, a conditional bigram cache, and a conventional static trigram. We describe the knowledge sources used to build such a model with ARPA's official WSJ corpus, and report on perplexity and word error rate results obtained with it. Then, three different adaptation paradigms are discussed, and an additional experiment, based on AP wire data, is used to compare them.
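
To make the idea of combining a static model with a cache component concrete, here is a minimal sketch. It is not the paper's method: the abstract does not say how the components are combined, so linear interpolation is an assumption here, and the plain add-one unigram cache stands in for the selective unigram and conditional bigram caches. All function names and the weights (0.8, 0.2) are hypothetical.

```python
import math
from collections import Counter


def cache_unigram_prob(word, history, vocab_size):
    """Add-one smoothed unigram estimate from the words seen so far.

    Assumption: a plain cache over the whole history, not the paper's
    selective cache.
    """
    counts = Counter(history)
    return (counts[word] + 1) / (len(history) + vocab_size)


def interpolate(probs, weights):
    """Linearly interpolate component probabilities (weights sum to 1)."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * p for w, p in zip(weights, probs))


def perplexity(words, static_prob, vocab_size, weights=(0.8, 0.2)):
    """Perplexity of `words` under a static model mixed with a cache.

    `static_prob` is any callable returning P(word) under the static
    model; it is a stand-in for a trained trigram or ME model.
    """
    log_sum = 0.0
    for i, word in enumerate(words):
        p_static = static_prob(word)
        p_cache = cache_unigram_prob(word, words[:i], vocab_size)
        log_sum += math.log(interpolate((p_static, p_cache), weights))
    return math.exp(-log_sum / len(words))


def uniform(word):
    """Toy static model: uniform over a 4-word vocabulary."""
    return 0.25


text = ["stocks", "fell", "stocks", "fell", "stocks"]
static_only = perplexity(text, uniform, 4, weights=(1.0, 0.0))
with_cache = perplexity(text, uniform, 4, weights=(0.8, 0.2))
```

On this toy text the repeated words receive more probability from the cache than from the uniform static model, so `with_cache` comes out below `static_only`. That is the effect an adaptive component is meant to capture; it says nothing about the size of the gains reported in the paper.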