Statistical machine translation using coercive two-level syntactic transduction

Authors:
Charles Schafer;David Yarowsky
Affiliations:
Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD
Venue:
EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
Year:
2003

Citing 13
Cited 5

Learning dependency translation models as collections of finite-state head transducers

Computational Linguistics - Special issue on finite-state methods in NLP
The mathematics of statistical machine translation: parameter estimation

Computational Linguistics - Special issue on using large corpora: II
Building a large annotated corpus of English: the penn treebank

Computational Linguistics - Special issue on using large corpora: II
Stochastic inversion transduction grammars and bilingual parsing of parallel corpora

Computational Linguistics
A program for aligning sentences in bilingual corpora

ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
A comparison of alignment models for statistical machine translation

COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
A syntax-based statistical translation model

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
A decoder for syntax-based statistical MT

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Transformation-based learning in the fast lane

NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
A categorial variation database for English

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Improved statistical alignment models

ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Stochastic finite-state models for spoken language machine translation

NAACL-ANLP-EMTS '00 Proceedings of the 2000 NAACL-ANLP Workshop on Embedded machine translation systems - Volume 5
Bootstrapping a multilingual part-of-speech tagger in one person-day

COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20

A weighted finite state transducer translation template model for statistical machine translation

Natural Language Engineering
A localized prediction model for statistical machine translation

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
A block bigram prediction model for statistical machine translation

ACM Transactions on Speech and Language Processing (TSLP)
Combining linguistic data views for phrase-based SMT

ParaText '05 Proceedings of the ACL Workshop on Building and Using Parallel Texts
The LDV-COMBO system for SMT

StatMT '06 Proceedings of the Workshop on Statistical Machine Translation

Quantified Score

Hi-index	0.00

Visualization

Abstract

We define, implement and evaluate a novel model for statistical machine translation, which is based on shallow syntactic analysis (part-of-speech tagging and phrase chunking) in both the source and target languages. It is able to model long-distance constituent motion and other syntactic phenomena without requiring a full parse in either language. We also examine aspects of lexical transfer, suggesting and exploring a concept of translation coercion across parts of speech, as well as a transfer model based on lemma-to-lemma translation probabilities, which holds promise for improving machine translation of low-density languages. Experiments are performed in both Arabic-to-English and French-to-English translation demonstrating the efficacy of the proposed techniques. Performance is automatically evaluated via the Bleu score metric.