Distribution-based pruning of backoff language models

  • Authors:
  • Jianfeng Gao;Kai-Fu Lee

  • Affiliations:
  • Microsoft Research China, China (both authors)

  • Venue:
  • ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
  • Year:
  • 2000

Abstract

We propose a distribution-based pruning of n-gram backoff language models. Instead of the conventional approach of pruning n-grams that are infrequent in the training data, we prune n-grams that are likely to be infrequent in a new document. Our method is based on the n-gram distribution, i.e. the probability that an n-gram occurs in a new document. Experimental results show that our method achieves a 7--9% reduction in word perplexity over conventional cutoff methods.
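To make the contrast concrete, the following is a minimal sketch (not the authors' exact method) of the two pruning criteria the abstract contrasts. The function names, the toy corpus, and the approximation of the n-gram distribution by document frequency (the fraction of training documents containing the n-gram) are all assumptions for illustration; the paper derives a more refined estimate of the probability that an n-gram occurs in a new document.

```python
# Hypothetical illustration: count-cutoff pruning vs. distribution-based
# pruning. All names and thresholds here are assumptions for the sketch.
from collections import Counter

def count_cutoff_prune(total_counts, cutoff=1):
    """Conventional cutoff: keep n-grams whose total training count > cutoff."""
    return {ng for ng, c in total_counts.items() if c > cutoff}

def distribution_prune(per_doc_counts, min_doc_prob=0.2):
    """Assumed variant: keep n-grams whose estimated probability of occurring
    in a new document exceeds min_doc_prob. The probability is crudely
    approximated here by document frequency over the training documents."""
    n_docs = len(per_doc_counts)
    doc_freq = Counter()
    for doc in per_doc_counts:
        doc_freq.update(set(doc))  # count each n-gram once per document
    return {ng for ng, df in doc_freq.items() if df / n_docs > min_doc_prob}

# Toy corpus of three "documents", each a bag of bigram counts.
docs = [
    Counter({("new", "york"): 1, ("rare", "pair"): 5}),
    Counter({("new", "york"): 2}),
    Counter({("new", "york"): 1, ("of", "the"): 3}),
]
totals = Counter()
for d in docs:
    totals.update(d)

# ("rare", "pair") survives the count cutoff because it is frequent in one
# document, but the distribution-based criterion prunes it because it is
# unlikely to appear in a new document.
print(count_cutoff_prune(totals, cutoff=2))
print(distribution_prune(docs, min_doc_prob=0.5))
```

The point of the toy example is that total frequency and document-level spread can disagree: an n-gram concentrated in a single training document can have a high count yet a low chance of occurring in unseen text.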