Improving phrase-based translation via word alignments from stochastic inversion transduction grammars

  • Authors: Markus Saers; Dekai Wu
  • Affiliations: Uppsala University, Sweden; Human Language Technology Center, HKUST, Hong Kong
  • Venue: SSST '09: Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation
  • Year: 2009

Abstract

We argue that learning word alignments through a compositionally structured, joint process yields higher phrase-based translation accuracy than the conventional heuristic of intersecting conditional models. Flawed word alignments can lead to flawed phrase translations that damage translation accuracy. Yet the IBM word alignments usually used today are known to be flawed, in large part because IBM models (1) model reordering by allowing unrestricted movement of words, rather than constrained movement of compositional units, and therefore must (2) attempt to compensate via directed, asymmetric distortion and fertility models. The conventional heuristics for recovering from the resulting alignment errors involve estimating two directed models in opposite directions and then intersecting their alignments, to make up for the fact that word alignment is in reality an inherently joint relation. A natural alternative is provided by Inversion Transduction Grammars, which estimate the joint word alignment relation directly, eliminating the need for any of the conventional heuristics. We show that this alignment ultimately produces superior translation accuracy on BLEU, NIST, and METEOR metrics over three distinct language pairs.
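
The intersection heuristic mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `intersect_alignments`, the representation of links as index pairs, and the toy alignments are all assumptions made for illustration.

```python
def intersect_alignments(src_to_tgt, tgt_to_src):
    """Keep only the links that both directed models propose.

    src_to_tgt: set of (source_index, target_index) links from the
                source-to-target model (hypothetical representation).
    tgt_to_src: set of (target_index, source_index) links from the
                target-to-source model.
    """
    # Flip the reverse-direction links so both sets use (source, target) order.
    flipped = {(s, t) for (t, s) in tgt_to_src}
    return src_to_tgt & flipped


# Toy alignments for a four-word sentence pair (made-up data).
forward = {(0, 0), (1, 2), (2, 1), (3, 3)}
backward = {(0, 0), (2, 1), (1, 2), (3, 2)}  # stored as (target, source)

symmetric = intersect_alignments(forward, backward)
```

In this toy case the two directed models disagree about source word 3, so the intersection leaves it unaligned. A joint model such as an ITG would produce a single symmetric alignment and would need no such post-hoc combination step.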