In this paper, we study the effect of different word-level preprocessing decisions for Arabic on SMT quality. Our results show that given large amounts of training data, splitting off only proclitics performs best. However, for small amounts of training data, it is best to apply English-like tokenization using part-of-speech tags, and sophisticated morphological analysis and disambiguation. Moreover, choosing the appropriate preprocessing produces a significant increase in BLEU score if there is a change in genre between training and test data.