A statistical approach to machine translation
Computational Linguistics
Integration of diverse recognition methodologies through reevaluation of N-best sentence hypotheses
HLT '91 Proceedings of the workshop on Speech and Natural Language
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
Forest-based statistical sentence generation
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
The surprise language exercises
ACM Transactions on Asian Language Information Processing (TALIP)
BLEU: a method for automatic evaluation of machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Minimum error rate training in statistical machine translation
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
A phrase-based, joint probability model for statistical machine translation
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Paraphrasing with bilingual parallel corpora
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
The Hiero machine translation system: extensions, evaluation, and analysis
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Improved statistical machine translation using paraphrases
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Paraphrasing for automatic evaluation
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Hierarchical Phrase-Based Translation
Computational Linguistics
Statistical machine translation
ACM Computing Surveys (CSUR)
Syntactic constraints on paraphrases extracted from parallel corpora
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Using paraphrases for parameter tuning in statistical machine translation
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Meteor: an automatic metric for MT evaluation with high levels of correlation with human judgments
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Proceedings of the Third Workshop on Statistical Machine Translation
StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
Further meta-evaluation of machine translation
StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
Feasibility of human-in-the-loop minimum error rate training
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
Statistical Machine Translation
Statistical Machine Translation
Turker-assisted paraphrasing for English-Arabic machine translation
CSLDAMT '10 Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk
String-to-dependency statistical machine translation
Computational Linguistics
The circle of meaning: from translation to paraphrasing and back
The circle of meaning: from translation to paraphrasing and back
Sentiment profiles of multiword expressions in test-taker essays: The case of noun-noun compounds
ACM Transactions on Speech and Language Processing (TSLP) - Special issue on multiword expressions: From theory to practice and use, part 2
Hi-index | 0.00 |
Today's Statistical Machine Translation (SMT) systems require high-quality human translations for parameter tuning, in addition to large bitexts for learning the translation units. This parameter tuning usually involves generating translations at different points in the parameter space and obtaining feedback against human-authored reference translations as to how good the translations. This feedback then dictates what point in the parameter space should be explored next. To measure this feedback, it is generally considered wise to have multiple (usually 4) reference translations to avoid unfair penalization of translation hypotheses which could easily happen given the large number of ways in which a sentence can be translated from one language to another. However, this reliance on multiple reference translations creates a problem since they are labor intensive and expensive to obtain. Therefore, most current MT datasets only contain a single reference. This leads to the problem of reference sparsity. In our previously published research, we had proposed the first paraphrase-based solution to this problem and evaluated its effect on Chinese-English translation. In this article, we first present extended results for that solution on additional source languages. More importantly, we present a novel way to generate “targeted” paraphrases that yields substantially larger gains (up to 2.7 BLEU points) in translation quality when compared to our previous solution (up to 1.6 BLEU points). In addition, we further validate these improvements by supplementing with human preference judgments obtained via Amazon Mechanical Turk.