Diversify and combine: improving word alignment for machine translation on low-resource languages

  • Authors:
  • Bing Xiang; Yonggang Deng; Bowen Zhou

  • Affiliations:
  • IBM T. J. Watson Research Center, Yorktown Heights, NY (all three authors)

  • Venue:
  • ACLShort '10: Proceedings of the ACL 2010 Conference Short Papers
  • Year:
  • 2010

Abstract

We present a novel method to improve word alignment quality, and ultimately translation performance, for low-resource languages by producing and combining complementary word alignments. Instead of focusing on improving a single set of word alignments, we generate multiple sets of diversified alignments based on different motivations, such as linguistic knowledge, morphology, and heuristics. We demonstrate this approach on an English-to-Pashto translation task by combining the alignments obtained from syntactic reordering, stemming, and partial words. The combined alignment outperforms the baseline alignment, with significantly higher F-scores and better translation performance.
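
To make the idea of combining complementary alignments concrete, the following is a minimal sketch, not the authors' method: the abstract does not say how the alignments are combined, so simple link-level voting is assumed here purely for illustration. The function names `combine_by_vote` and `f_score`, the `min_votes` threshold, and the toy link sets are all hypothetical. An alignment is represented as a set of (source index, target index) links, and the F-score is the usual harmonic mean of link precision and recall against a reference alignment.

```python
from collections import Counter


def combine_by_vote(alignments, min_votes=2):
    """Keep every link proposed by at least `min_votes` input alignments.

    Assumption: voting is only one plausible combination scheme; the
    abstract does not specify the scheme actually used in the paper.
    """
    votes = Counter(link for alignment in alignments for link in set(alignment))
    return {link for link, count in votes.items() if count >= min_votes}


def f_score(hypothesis, reference):
    """Balanced F-score of hypothesis links against reference links."""
    hyp, ref = set(hypothesis), set(reference)
    correct = len(hyp & ref)
    if correct == 0:
        return 0.0
    precision = correct / len(hyp)
    recall = correct / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy data invented for illustration: links are (source_index, target_index).
from_reordering = {(0, 0), (1, 2), (2, 1)}
from_stemming = {(0, 0), (1, 2), (2, 3)}
from_partial_words = {(0, 0), (1, 1), (2, 1)}
reference = {(0, 0), (1, 2), (2, 1)}

combined = combine_by_vote([from_reordering, from_stemming, from_partial_words])
print(sorted(combined), f_score(combined, reference))
print(f_score(from_stemming, reference))
```

In this toy case each link in the reference is proposed by at least two of the three inputs, so voting recovers links that any single noisy alignment might miss. Raising `min_votes` trades recall for precision, and lowering it to 1 reduces the scheme to a plain union.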