The Alignment Template Approach to Statistical Machine Translation
Computational Linguistics
The Proposition Bank: An Annotated Corpus of Semantic Roles
Computational Linguistics
Improved statistical machine translation using paraphrases
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Moses: open source toolkit for statistical machine translation
ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Syntactic constraints on paraphrases extracted from parallel corpora
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Parallel implementations of word alignment tool
SETQA-NLP '08 Software Engineering, Testing, and Quality Assurance for Natural Language Processing
CMU Haitian Creole-English translation system for WMT 2011
WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
A Bayesian approach to unsupervised semantic role induction
EACL '12 Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
Crosslingual induction of semantic roles
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Enriching parallel corpora for statistical machine translation with semantic negation rephrasing
SSST-6 '12 Proceedings of the Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation
Multilingual joint parsing of syntactic and semantic dependencies with a latent variable model
Computational Linguistics
Hi-index | 0.00 |
We present an approach of expanding parallel corpora for machine translation. By utilizing Semantic role labeling (SRL) on one side of the language pair, we extract SRL substitution rules from existing parallel corpus. The rules are then used for generating new sentence pairs. An SVM classifier is built to filter the generated sentence pairs. The filtered corpus is used for training phrase-based translation models, which can be used directly in translation tasks or combined with baseline models. Experimental results on Chinese-English machine translation tasks show an average improvement of 0.45 BLEU and 1.22 TER points across 5 different NIST test sets.