Error driven paraphrase annotation using Mechanical Turk

Authors:
Olivia Buzek;Philip Resnik;Benjamin B. Bederson
Affiliations:
University of Maryland, College Park, MD;University of Maryland, College Park, MD;University of Maryland, College Park, MD
Venue:
CSLDAMT '10 Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk
Year:
2010

Citing 8
Cited 10

Evaluating translational correspondence using annotation projection

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Improved statistical machine translation using paraphrases

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Syntactic constraints on paraphrases extracted from parallel corpora

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Using paraphrases for parameter tuning in statistical machine translation

StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
The 'noisier channel': translation from morphologically complex languages

StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Improved statistical machine translation using monolingually-derived paraphrases

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
TER-Plus: paraphrase, semantic, and alignment enhancements to Translation Edit Rate

Machine Translation
Translation by iterative collaboration between monolingual users

Proceedings of Graphics Interface 2010

Translation by iterative collaboration between monolingual users

Proceedings of Graphics Interface 2010
Creating speech and language data with Amazon's Mechanical Turk

CSLDAMT '10 Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk
Improving translation via targeted paraphrasing

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Collecting highly parallel data for paraphrase evaluation

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Towards strict sentence intersection: decoding and evaluation strategies

MTTG '11 Proceedings of the Workshop on Monolingual Text-To-Text Generation
The value of monolingual crowdsourcing in a real-world translation scenario: simulation using Haitian Creole emergency SMS messages

WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
CSAF: a community-sourcing annotation framework

LAW VI '12 Proceedings of the Sixth Linguistic Annotation Workshop
Using targeted paraphrasing and monolingual crowdsourcing to improve translation

ACM Transactions on Intelligent Systems and Technology (TIST) - Special Sections on Paraphrasing; Intelligent Systems for Socially Aware Computing; Social Computing, Behavioral-Cultural Modeling, and Prediction
Paraphrase acquisition via crowdsourcing and machine learning

ACM Transactions on Intelligent Systems and Technology (TIST) - Special Sections on Paraphrasing; Intelligent Systems for Socially Aware Computing; Social Computing, Behavioral-Cultural Modeling, and Prediction
N-gram posterior probability confidence measures for statistical machine translation: an empirical study

Machine Translation

Quantified Score

Hi-index	0.00

Visualization

Abstract

The source text provided to a machine translation system is typically only one of many ways the input sentence could have been expressed, and alternative forms of expression can often produce a better translation. We introduce here error driven paraphrasing of source sentences: instead of paraphrasing a source sentence exhaustively, we obtain paraphrases for only the parts that are predicted to be problematic for the translation system. We report on an Amazon Mechanical Turk study that explores this idea, and establishes via an oracle evaluation that it holds the potential to substantially improve translation quality.