Language models play an important role in large-vocabulary speech recognition and statistical machine translation systems. Back-off language models have been the dominant approach for several decades. Some years ago, the clear trend was to build huge language models trained on hundreds of billions of words; lately this has changed, and recent work concentrates on data selection. Continuous space methods are a very competitive approach, but their high computational complexity has so far kept them out of widespread use. This paper presents an experimental comparison of all these approaches on a large statistical machine translation task. We also describe an open-source implementation for training and using continuous space language models (CSLMs) at this scale, including an efficient implementation of the CSLM on Nvidia graphics processing units. By these means, we are able to train a CSLM on more than 500 million words in 20 hours. This CSLM yields an improvement of up to 1.8 BLEU points over the best back-off language model we were able to build.
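As a rough illustration of what a CSLM computes, the following is a minimal NumPy sketch of a Bengio-style neural probabilistic language model forward pass: the n-1 history words are mapped through a shared projection matrix into continuous space, concatenated, passed through a tanh hidden layer, and a softmax over an output shortlist yields the word probabilities. All layer sizes, variable names, and the shortlist size below are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np

# Hypothetical sizes (illustrative only, not the paper's configuration).
VOCAB = 16384      # output shortlist size for the softmax layer
CONTEXT = 3        # n-1 history words, i.e. a 4-gram CSLM
EMB = 256          # dimension of the continuous word representation
HIDDEN = 512       # hidden layer size

rng = np.random.default_rng(0)

# Shared projection matrix: one continuous vector per vocabulary word.
R = rng.normal(scale=0.01, size=(VOCAB, EMB))
# Hidden and output layer parameters.
W_h = rng.normal(scale=0.01, size=(CONTEXT * EMB, HIDDEN))
b_h = np.zeros(HIDDEN)
W_o = rng.normal(scale=0.01, size=(HIDDEN, VOCAB))
b_o = np.zeros(VOCAB)

def cslm_probs(context_ids):
    """Return P(w | history) over the shortlist for one n-gram history.

    context_ids: array of n-1 word indices (the history words).
    """
    # 1. Project each history word into continuous space and concatenate.
    x = R[context_ids].reshape(-1)        # shape: (CONTEXT * EMB,)
    # 2. Non-linear hidden layer.
    h = np.tanh(x @ W_h + b_h)            # shape: (HIDDEN,)
    # 3. Softmax over the output shortlist.
    z = h @ W_o + b_o
    z -= z.max()                          # for numerical stability
    p = np.exp(z)
    return p / p.sum()

p = cslm_probs(np.array([12, 407, 9]))    # e.g. one 4-gram history
print(p.shape, p.sum())                   # (16384,) 1.0
```

Training such a model over hundreds of millions of words makes the softmax over the shortlist the dominant cost, which is what motivates the GPU implementation described above.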