Class-based n-gram models of natural language
Computational Linguistics
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
Using POS information for statistical machine translation into morphologically rich languages
EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
Discriminative training and maximum entropy models for statistical machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Improved statistical alignment models
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Statistical Machine Translation with Scarce Resources Using Morpho-syntactic Information
Computational Linguistics
Generation of word graphs in statistical machine translation
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Improving statistical machine translation using shallow linguistic knowledge
Computer Speech and Language
Hi-index | 0.00 |
In this paper, we present an empirical study that utilizes morph-syntactical information to improve translation quality. With three kinds of language pairs matched according to morph-syntactical similarity or difference, we investigate the effects of various morpho-syntactical information, such as base form, part-of-speech, and the relative positional information of a word in a statistical machine translation framework. We learn not only translation models but also word-based/class-based language models by manipulating morphological and relative positional information. And we integrate the models into a log-linear model. Experiments on multilingual translations showed that such morphological information as part-of-speech and base form are effective for improving performance in morphologically rich language pairs and that the relative positional features in a word group are useful for reordering the local word orders. Moreover, the use of a class-based n-gram language model improves performance by alleviating the data sparseness problem in a word-based language model.