We investigate different granularities of sub-lexical representation in English-to-Turkish statistical machine translation. We find that (i) representing both Turkish and English at the morpheme level, with selective morpheme grouping on the Turkish side of the training data; (ii) augmenting the training data with "sentences" that contain only the content words of the original sentences, to bias root-word alignment; (iii) reranking the n-best morpheme-sequence outputs of the decoder with a word-based language model; and (iv) using model iteration all yield non-trivial improvements over a fully word-based baseline. Despite our very limited training data, we improve from 20.22 BLEU points for our simplest model to 25.08 BLEU points, a gain of 4.86 points or 24% relative.
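Step (iii) above can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the `+`-marked suffix convention, the toy add-one-smoothed unigram LM (standing in for a real word-based n-gram LM), and the simple linear combination of decoder and LM scores are all assumptions made for the example.

```python
import math
from collections import Counter

def morphemes_to_words(morpheme_seq):
    """Join a morpheme sequence back into surface words.
    Assumes (hypothetically) that suffix morphemes carry a leading '+'."""
    words, current = [], ""
    for m in morpheme_seq.split():
        if m.startswith("+"):
            current += m[1:]           # attach suffix to the current word
        else:
            if current:
                words.append(current)
            current = m                # a bare morpheme starts a new word
    if current:
        words.append(current)
    return words

class UnigramLM:
    """Toy word-level LM with add-one smoothing; a stand-in for the
    word-based n-gram LM used for reranking."""
    def __init__(self, corpus_words):
        self.counts = Counter(corpus_words)
        self.total = sum(self.counts.values())
        self.vocab = len(self.counts) + 1   # +1 for unseen words
    def logprob(self, words):
        return sum(math.log((self.counts[w] + 1) / (self.total + self.vocab))
                   for w in words)

def rerank(nbest, lm, lm_weight=1.0):
    """Pick the (decoder_score, morpheme_seq) candidate maximising
    decoder score plus the weighted word-LM score of its surface words."""
    def score(cand):
        decoder_score, morpheme_seq = cand
        return decoder_score + lm_weight * lm.logprob(morphemes_to_words(morpheme_seq))
    return max(nbest, key=score)

# Usage: the word-LM prefers the candidate whose surface words it has seen,
# overriding the decoder's own ranking.
lm = UnigramLM(["evlerde", "kitap", "okuyor", "evlerde"])
nbest = [(-10.0, "ev +ler +de kitap oku +yor"),
         (-9.5,  "ev +ler +de kitab oku +yor")]
best = rerank(nbest, lm)  # first candidate wins despite its lower decoder score
```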