DUSTer: A Method for Unraveling Cross-Language Divergences for Statistical Word-Level Alignment

  • Authors:
  • Bonnie J. Dorr; Lisa Pearl; Rebecca Hwa; Nizar Habash

  • Venue:
  • AMTA '02 Proceedings of the 5th Conference of the Association for Machine Translation in the Americas on Machine Translation: From Research to Real Users
  • Year:
  • 2002

Abstract

The frequent occurrence of divergences--structural differences between languages--presents a great challenge for statistical word-level alignment. In this paper, we introduce DUSTer, a method for systematically identifying common divergence types and transforming an English sentence structure to bear a closer resemblance to that of another language. Our ultimate goal is to enable more accurate alignment and projection of dependency trees in another language without requiring any training on dependency-tree data in that language. We present an empirical analysis comparing the complexities of performing word-level alignments with and without divergence handling. Our results suggest that our approach facilitates word-level alignment, particularly for sentence pairs containing divergences.