Kneser-Ney (1995) smoothing and its variants are generally recognized as yielding the lowest perplexity of any known method for estimating N-gram language models. Kneser-Ney smoothing, however, requires nonstandard N-gram counts for the lower-order models used to smooth the highest-order model, which makes it inappropriate or inconvenient for some applications. In this paper, we introduce a new smoothing method based on ordinary counts that outperforms all of the previous ordinary-count methods we have tested, eliminating most of the gap between those methods and Kneser-Ney.
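To make the distinction concrete, the sketch below (using hypothetical toy data, not the paper's method or experiments) contrasts the ordinary token counts used by standard smoothing methods with the Kneser-Ney continuation counts N1+(·w), i.e. the number of distinct contexts a word is seen to follow, which Kneser-Ney substitutes for ordinary counts in its lower-order models.

```python
from collections import defaultdict

def lower_order_counts(bigrams):
    """Compute both kinds of lower-order (unigram) counts from bigram data.

    Ordinary counts sum token frequencies; Kneser-Ney continuation
    counts N1+(. w) instead count the distinct word types preceding w.
    """
    ordinary = defaultdict(int)       # c(w): how often w occurs
    left_contexts = defaultdict(set)  # distinct words seen before w
    for prev, w in bigrams:
        ordinary[w] += 1
        left_contexts[w].add(prev)
    continuation = {w: len(ctx) for w, ctx in left_contexts.items()}
    return dict(ordinary), continuation

# Hypothetical toy corpus: "francisco" is frequent but follows only
# one context ("san"), so its continuation count is low, while "york"
# follows two distinct contexts.
bigrams = [("san", "francisco")] * 3 + [("new", "york"), ("in", "york")]
ord_c, cont_c = lower_order_counts(bigrams)
print(ord_c["francisco"], cont_c["francisco"])  # 3 1
print(ord_c["york"], cont_c["york"])            # 2 2
```

The divergence between the two counts for words like "francisco" is exactly why Kneser-Ney's lower-order models require these nonstandard counts, and why a competitive method built on ordinary counts alone is attractive when only standard counts are available.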