Syntax-based alignment: supervised or unsupervised?

Authors:
Hao Zhang; Daniel Gildea
Affiliations:
University of Rochester, Rochester, NY; University of Rochester, Rochester, NY
Venue:
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Year:
2004

References (13):
A statistical approach to machine translation. Computational Linguistics
Machine translation divergences: a formal description and proposed solution. Computational Linguistics
The mathematics of statistical machine translation: parameter estimation. Computational Linguistics - Special issue on using large corpora: II
Stochastic inversion transduction grammars and bilingual parsing of parallel corpora. Computational Linguistics
Building a large-scale annotated Chinese corpus. COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
A decoder for syntax-based statistical MT. ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Evaluating translational correspondence using annotation projection. ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Loosely tree-based alignment for machine translation. ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
A comparative study on reordering constraints in statistical machine translation. ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Improved statistical alignment models. ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Phrasal cohesion and statistical machine translation. EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
An evaluation exercise for word alignment. HLT-NAACL-PARALLEL '03 Proceedings of the HLT-NAACL 2003 Workshop on Building and using parallel texts: data driven machine translation and beyond - Volume 3
Design of a multi-lingual, parallel-processing statistical parsing engine. HLT '02 Proceedings of the second international conference on Human Language Technology Research

Cited by (12):
Stochastic lexicalized inversion transduction grammar for alignment. ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Empirical lower bounds on the complexity of translational equivalence. ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Soft syntactic constraints for word alignment through discriminative training. COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Inducing word alignments with bilexical synchronous trees. COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Inversion transduction grammar for joint phrasal translation modeling. SSST '07 Proceedings of the NAACL-HLT 2007/AMTA Workshop on Syntax and Structure in Statistical Translation
Empirical lower bounds on alignment error rates in syntax-based machine translation. SSST '09 Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation
On the complexity of alignment problems in two synchronous grammar formalisms. SSST '09 Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation
Recognizing paraphrases and textual entailment using inversion transduction grammars. EMSEE '05 Proceedings of the ACL Workshop on Empirical Modeling of Semantic Equivalence and Entailment
Empirical lower bounds on translation unit error rate for the full class of inversion transduction grammars. IWPT '09 Proceedings of the 11th International Conference on Parsing Technologies
Unsupervised syntactic alignment with inversion transduction grammars. HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Textual entailment recognition using inversion transduction grammars. MLCW'05 Proceedings of the First international conference on Machine Learning Challenges: evaluating Predictive Uncertainty, Visual Object Classification, and Recognizing Textual Entailment
Inversion transduction grammar constraints for mining parallel sentences from quasi-comparable corpora. IJCNLP'05 Proceedings of the Second international joint conference on Natural Language Processing

Abstract

Tree-based approaches to alignment model translation as a sequence of probabilistic operations transforming the syntactic parse tree of a sentence in one language into that of the other. The trees may be learned directly from parallel corpora (Wu, 1997), or provided by a parser trained on hand-annotated treebanks (Yamada and Knight, 2001). In this paper, we compare these approaches on Chinese-English and French-English datasets, and find that automatically derived trees result in better agreement with human-annotated word-level alignments for unseen test data.