Three heads are better than one
ANLC '94 Proceedings of the fourth conference on Applied natural language processing
Minimum error rate training in statistical machine translation
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Statistical Machine Translation with Scarce Resources Using Morpho-syntactic Information
Computational Linguistics
Multi-engine machine translation with voted language model
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Arabic tokenization, part-of-speech tagging and morphological disambiguation in one fell swoop
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Combination of Arabic preprocessing schemes for statistical machine translation
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Improving statistical MT through morphological analysis
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Multi-engine machine translation guided by explicit word matching
ACLdemo '05 Proceedings of the ACL 2005 on Interactive poster and demonstration sessions
Morphological analysis for statistical machine translation
HLT-NAACL-Short '04 Proceedings of HLT-NAACL 2004: Short Papers
Automatic tagging of Arabic text: from raw text to base phrase chunks
HLT-NAACL-Short '04 Proceedings of HLT-NAACL 2004: Short Papers
PORTAGE: a phrase-based machine translation system
ParaText '05 Proceedings of the ACL Workshop on Building and Using Parallel Texts
Combination of Arabic preprocessing schemes for statistical machine translation
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Chinese word segmentation and statistical machine translation
ACM Transactions on Speech and Language Processing (TSLP)
HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
Context-based Arabic morphological analysis for machine translation
CoNLL '08 Proceedings of the Twelfth Conference on Computational Natural Language Learning
Combination of statistical word alignments based on multiple preprocessing schemes
NAACL-Short '07 Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers
Toward using morphology in French-English phrase-based SMT
StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation
Improving Arabic-Chinese statistical machine translation using English as pivot language
StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation
Morphology-Based Segmentation Combination for Arabic Mention Detection
ACM Transactions on Asian Language Information Processing (TALIP)
A comparison study of some Arabic root finding algorithms
Journal of the American Society for Information Science and Technology
Arabic Mention Detection: toward better unit of analysis
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Unsupervised search for the optimal segmentation for statistical machine translation
ACLstudent '10 Proceedings of the ACL 2010 Student Research Workshop
A hybrid morpheme-word representation for machine translation of morphologically rich languages
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Nonparametric word segmentation for machine translation
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Improving machine translation of null subjects in Italian and Spanish
EACL '12 Proceedings of the Student Research Workshop at the 13th Conference of the European Chapter of the Association for Computational Linguistics
Unsupervised morphology rivals supervised morphology for Arabic MT
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Hi-index | 0.00 |
Statistical machine translation is quite robust when it comes to the choice of input representation. It only requires consistency between training and testing. As a result, there is a wide range of possible preprocessing choices for data used in statistical machine translation. This is even more so for morphologically rich languages such as Arabic. In this paper, we study the effect of different word-level preprocessing schemes for Arabic on the quality of phrase-based statistical machine translation. We also present and evaluate different methods for combining preprocessing schemes resulting in improved translation quality.