Syntax-based alignment: supervised or unsupervised?

  • Authors:
  • Hao Zhang;Daniel Gildea

  • Affiliations:
  • University of Rochester, Rochester, NY;University of Rochester, Rochester, NY

  • Venue:
  • COLING '04 Proceedings of the 20th international conference on Computational Linguistics
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

Tree-based approaches to alignment model translation as a sequence of probabilistic operations transforming the syntactic parse tree of a sentence in one language into that of the other. The trees may be learned directly from parallel corpora (Wu, 1997), or provided by a parser trained on hand-annotated treebanks (Yamada and Knight, 2001). In this paper, we compare these approaches on Chinese-English and French-English datasets, and find that automatically derived trees result in better agreement with human-annotated word-level alignments for unseen test data.