We investigate the empirical behavior of n-gram discounts within and across domains. When a language model is trained and evaluated on two corpora from exactly the same domain, discounts are roughly constant, matching the assumptions of modified Kneser-Ney LMs. However, when training and test corpora diverge, the empirical discount grows essentially as a linear function of the n-gram count. We adapt a Kneser-Ney language model to incorporate such growing discounts, resulting in perplexity improvements over modified Kneser-Ney and Jelinek-Mercer baselines.
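The abstract's core measurement is the empirical discount: for n-grams seen k times in training, how much smaller their (size-scaled) count is on a second corpus. Below is a minimal sketch of one way to compute such per-count discounts; the exact methodology in the paper may differ, and names like `empirical_discounts` are illustrative rather than taken from the paper.

```python
from collections import Counter
from statistics import mean

def empirical_discounts(train_tokens, test_tokens, n=3):
    """Average empirical discount per training count: for n-grams with
    training count k, compare k to their size-scaled count on the other
    corpus. Illustrative sketch only, not the paper's exact procedure."""
    def ngram_counts(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    train_counts = ngram_counts(train_tokens)
    test_counts = ngram_counts(test_tokens)

    # Scale the second corpus so the two are comparable in size.
    scale = sum(train_counts.values()) / max(sum(test_counts.values()), 1)

    by_train_count = {}
    for gram, k in train_counts.items():
        by_train_count.setdefault(k, []).append(k - scale * test_counts[gram])

    # discount(k): average mass an n-gram of training count k "loses".
    return {k: mean(vals) for k, vals in sorted(by_train_count.items())}
```

If the two corpora come from the same domain, this table should be roughly flat in k (the constant-discount assumption of modified Kneser-Ney); under domain mismatch the abstract reports that it grows roughly linearly with k, which is the behavior the adapted model's count-dependent discount is meant to capture.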