On Growing and Pruning Kneser–Ney Smoothed N-Gram Models

  • Authors:
  • V. Siivola; T. Hirsimäki; S. Virpioja

  • Affiliations:
  • Helsinki Univ. of Technol., Helsinki

  • Venue:
  • IEEE Transactions on Audio, Speech, and Language Processing
  • Year:
  • 2007

Abstract

N-gram models are the most widely used language models in large vocabulary continuous speech recognition. Since the size of the model grows rapidly with respect to the model order and the available training data, many methods have been proposed for pruning the least relevant N-grams from the model. However, correct smoothing of the N-gram probability distributions is important, and performance may degrade significantly if pruning conflicts with smoothing. In this paper, we show that some of the commonly used pruning methods do not take into account how removing an N-gram should modify the backoff distributions in the state-of-the-art Kneser-Ney smoothing. To solve this problem, we present two new algorithms: one for pruning Kneser-Ney smoothed models, and one for growing them incrementally. Experiments on Finnish and English text corpora show that the proposed pruning algorithm provides considerable improvements over previous pruning algorithms on Kneser-Ney smoothed models and is also better than the baseline entropy-pruned Good-Turing smoothed models. The models created by the growing algorithm provide a good starting point for our pruning algorithm, leading to further improvements. The improvements in Finnish speech recognition over the other Kneser-Ney smoothed models are statistically significant as well.
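
For context, the standard interpolated Kneser-Ney estimate (textbook form; the notation here is ours, not necessarily the paper's) makes the coupling between model orders explicit:

  P_KN(w_i | w_{i-n+1}^{i-1}) = max(c(w_{i-n+1}^{i}) - D, 0) / c(w_{i-n+1}^{i-1}) + λ(w_{i-n+1}^{i-1}) · P_KN(w_i | w_{i-n+2}^{i-1}),
  λ(w_{i-n+1}^{i-1}) = D · N_{1+}(w_{i-n+1}^{i-1} •) / c(w_{i-n+1}^{i-1}),

where the lower-order (backoff) distribution P_KN(w_i | w_{i-n+2}^{i-1}) is estimated from continuation counts N_{1+}(• w_{i-n+2}^{i}), i.e., the number of distinct higher-order contexts in which the N-gram occurs, rather than from raw counts. Because those continuation counts are defined by which higher-order N-grams are present in the model, removing an N-gram without updating them leaves the backoff distribution inconsistent; this is the interaction between pruning and smoothing that the abstract refers to.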