Foundations of statistical natural language processing
Foundations of statistical natural language processing
A systematic comparison of various statistical alignment models
Computational Linguistics
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
HMM-based word alignment in statistical translation
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
BLEU: a method for automatic evaluation of machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Intricacies of Collins' Parsing Model
Computational Linguistics
Clause restructuring for statistical machine translation
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Arabic tokenization, part-of-speech tagging and morphological disambiguation in one fell swoop
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Maximum entropy based restoration of Arabic diacritics
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Improving a statistical MT system with automatically learned rewrite patterns
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Algorithms for deterministic incremental dependency parsing
Computational Linguistics
Arabic Natural Language Processing
Arabic Natural Language Processing
HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
Moses: open source toolkit for statistical machine translation
ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Arabic preprocessing schemes for statistical machine translation
NAACL-Short '06 Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers
Cohesive constraints in a beam search phrase-based decoder
NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
SSST '07 Proceedings of the NAACL-HLT 2007/AMTA Workshop on Syntax and Structure in Statistical Translation
Using syntax to improve word alignment precision for syntax-based machine translation
StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
Using shallow syntax information to improve word alignment and reordering for SMT
StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
CATiB: the Columbia Arabic Treebank
ACLShort '09 Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
Improved word alignment with statistics and linguistic heuristics
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Improving Arabic dependency parsing with lexical and inflectional morphological features
SPMRL '10 Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages
Chunk-based verb reordering in VSO sentences for Arabic-English statistical machine translation
WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Better Arabic parsing: baselines, evaluations, and analysis
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Clause restructuring for SMT not absolutely helpful
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Fuzzy syntactic reordering for phrase-based statistical machine translation
WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
Encouraging consistent translation choices
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Hi-index | 0.00 |
We study challenges raised by the order of Arabic verbs and their subjects in statistical machine translation (SMT). We show that the boundaries of post-verbal subjects (VS) are hard to detect accurately, even with a state-of-the-art Arabic dependency parser. In addition, VS constructions have highly ambiguous reordering patterns when translated to English, and these patterns are very different for matrix (main clause) VS and non-matrix (subordinate clause) VS. Based on this analysis, we propose a novel method for leveraging VS information in SMT: we reorder VS constructions into pre-verbal (SV) order for word alignment. Unlike previous approaches to source-side reordering, phrase extraction and decoding are performed using the original Arabic word order. This strategy significantly improves BLEU and TER scores, even on a strong large-scale baseline. Limiting reordering to matrix VS yields further improvements.