We present a method for incorporating arbitrary context-informed word attributes into statistical machine translation by clustering attribute-qualified source words and smoothing their word translation probabilities using binary decision trees. The decision trees are used in machine translation in two ways: the attribute-qualified source word clusters are used directly, or attribute-dependent lexical translation probabilities obtained from the trees serve as a lexical smoothing feature in the decoder model. In experiments on Arabic-to-English newswire data, with Arabic diacritics and part-of-speech tags as source word attributes, the proposed method improves on a state-of-the-art translation system.
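The clustering idea can be illustrated with a minimal sketch. It is not the system described above. It assumes a greedy tree whose binary questions ask whether the attribute equals a given value, a log-likelihood gain criterion with a fixed stopping threshold, and invented toy counts for three diacritized variants of one undiacritized Arabic word. The function names, the `min_gain` threshold, and the data are all hypothetical.

```python
import math
from collections import Counter

def log_likelihood(counts):
    """Log-likelihood of target-word counts under their own ML distribution."""
    total = sum(counts.values())
    return sum(c * math.log(c / total) for c in counts.values() if c > 0)

def pool(forms, data):
    """Sum the target-word counts of several attribute-qualified forms."""
    pooled = Counter()
    for form in forms:
        pooled.update(data[form])
    return pooled

def grow(forms, data, min_gain):
    """Greedily split a set of forms; return the leaf clusters.

    Assumption: each binary question is 'is the attribute equal to f?'.
    """
    forms = sorted(forms)
    base = log_likelihood(pool(forms, data))
    best_gain, best_split = 0.0, None
    for f in forms:
        yes, no = [f], [g for g in forms if g != f]
        if not no:
            continue  # a single form cannot be split further
        gain = (log_likelihood(pool(yes, data))
                + log_likelihood(pool(no, data)) - base)
        if gain > best_gain:
            best_gain, best_split = gain, (yes, no)
    if best_split is None or best_gain < min_gain:
        return [frozenset(forms)]  # stop: this node becomes a leaf cluster
    yes, no = best_split
    return grow(yes, data, min_gain) + grow(no, data, min_gain)

def cluster_probs(clusters, data):
    """Give every form the pooled translation distribution of its cluster."""
    table = {}
    for cluster in clusters:
        pooled = pool(cluster, data)
        total = sum(pooled.values())
        probs = {target: c / total for target, c in pooled.items()}
        for form in cluster:
            table[form] = probs
    return table

# Hypothetical alignment counts for diacritized variants of one source word.
data = {
    "kutub": Counter({"books": 9, "wrote": 1}),
    "kataba": Counter({"wrote": 8, "written": 2}),
    "kutiba": Counter({"wrote": 1, "written": 1}),  # sparse variant
}

clusters = grow(data.keys(), data, min_gain=1.0)
table = cluster_probs(clusters, data)
print(clusters)
print(table["kutiba"])
```

With these toy counts the sparse variant shares a leaf with a better-attested variant, so its translation probabilities are estimated from the pooled counts rather than from two observations. This is the smoothing effect that the clustering is meant to provide.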