Adaptive language modeling using the maximum entropy principle

  • Authors:
  • Raymond Lau; Ronald Rosenfeld; Salim Roukos

  • Affiliation:
  • IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY

  • Venue:
  • HLT '93 Proceedings of the workshop on Human Language Technology
  • Year:
  • 1993

Abstract

We describe our ongoing efforts at adaptive statistical language modeling. Central to our approach is the Maximum Entropy (ME) Principle, allowing us to combine evidence from multiple sources, such as long-distance triggers and conventional short-distance trigrams. Given consistent statistical evidence, a unique ME solution is guaranteed to exist, and an iterative algorithm exists which is guaranteed to converge to it. Among the advantages of this approach are its simplicity, its generality, and its incremental nature. Among its disadvantages are its computational requirements. We describe a succession of ME models, culminating in our current Maximum Likelihood/Maximum Entropy (ML/ME) model. Preliminary results with the latter show a 27% perplexity reduction as compared to a conventional trigram model.
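The iterative algorithm alluded to in the abstract is, for this family of constrained maximum entropy models, typically Generalized Iterative Scaling (GIS). The sketch below is not from the paper: it fits a toy ME distribution over a four-word vocabulary to two illustrative feature-expectation constraints (the vocabulary, features, and targets are invented for illustration), using the standard GIS update with a slack feature so that feature sums are constant.

```python
import math

# Toy Generalized Iterative Scaling (GIS) sketch. Vocabulary, features,
# and target expectations are illustrative assumptions, not the paper's data.
vocab = ["the", "cat", "sat", "mat"]

# Binary feature functions over words; each constraint fixes E[f_i].
features = [
    lambda w: 1.0 if w == "the" else 0.0,           # unigram-style indicator
    lambda w: 1.0 if w in ("cat", "mat") else 0.0,  # trigger-style indicator
]
target = [0.5, 0.3]  # desired (consistent) feature expectations

# GIS requires the feature sum to be constant for every outcome;
# pad with a slack feature up to C = max total feature count.
C = 2.0
def slack(w):
    return C - sum(f(w) for f in features)

all_feats = features + [slack]
target_full = target + [C - sum(target)]

lam = [0.0] * len(all_feats)  # one Lagrange multiplier per constraint

def model_probs(lam):
    """p(w) proportional to exp(sum_i lam_i * f_i(w))."""
    scores = [math.exp(sum(l * f(w) for l, f in zip(lam, all_feats)))
              for w in vocab]
    z = sum(scores)
    return [s / z for s in scores]

# GIS update: lam_i += (1/C) * log(empirical / model expectation).
# With consistent constraints this is guaranteed to converge to the ME solution.
for _ in range(2000):
    p = model_probs(lam)
    for i, f in enumerate(all_feats):
        expect = sum(pw * f(w) for pw, w in zip(p, vocab))
        lam[i] += math.log(target_full[i] / expect) / C

p = model_probs(lam)
p_the = p[vocab.index("the")]
```

Under these constraints the ME solution puts probability 0.5 on "the", splits the 0.3 mass evenly between "cat" and "mat" (maximum entropy leaves them indistinguishable), and gives the remaining 0.2 to "sat".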