Imposing constraints from the source tree on ITG constraints for SMT

  • Authors:
  • Hirofumi Yamamoto;Hideo Okuma;Eiichiro Sumita

  • Affiliations:
  • National Institute of Information and Communications Technology, Soraku-gun, Kyoto, Japan and ATR Spoken Language Communication Research Labs and Kinki University;National Institute of Information and Communications Technology, Soraku-gun, Kyoto, Japan and ATR Spoken Language Communication Research Labs;National Institute of Information and Communications Technology, Soraku-gun, Kyoto, Japan and ATR Spoken Language Communication Research Labs

  • Venue:
  • SSST '08 Proceedings of the Second Workshop on Syntax and Structure in Statistical Translation
  • Year:
  • 2008

Abstract

Erroneous word reordering is one of the most serious problems in current statistical machine translation (SMT). Many word-reordering constraint techniques have been proposed to address it, and the inversion transduction grammar (ITG) constraints are one of them. Under ITG constraints, the target-side word order is obtained by rotating nodes of a binary tree over the source side. These rotations do not take the actual binary tree of the source sentence into account, so any binary bracketing of the source is permitted. Stronger reordering constraints can therefore be obtained by imposing further constraints derived from the source tree on the ITG constraints. For example, for the source word sequence { a b c d }, ITG constraints allow a total of twenty-two target word orderings. When the source binary tree instance ((a b) (c d)) is given, however, our proposed "imposing source tree on ITG" (IST-ITG) constraints allow only eight word orderings. This reduction in the number of permitted word-order permutations effectively suppresses erroneous word orderings. In our experiments with IST-ITG on the data of the NIST MT08 English-to-Chinese translation track, the proposed method yielded a 1.8-point improvement in character BLEU-4 (35.2 to 37.0) and a 6.2-point reduction in CER (74.1% to 67.9%) compared with our baseline condition.
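
The two counts quoted in the abstract (twenty-two orderings under ITG, eight under IST-ITG) can be checked with a small enumeration. The following is a minimal sketch, not the authors' decoder: the function names `itg_orderings` and `ist_itg_orderings` and the nested-tuple tree encoding are assumptions made for illustration. It assumes that ITG permits any binary bracketing of the source with optional swapping of the two children at each node, whereas IST-ITG fixes the bracketing to the given source tree and only allows the swaps.

```python
from itertools import product


def itg_orderings(words):
    """Orderings reachable under plain ITG constraints (hypothetical helper).

    Every binary bracketing of the source is tried, and at each node the
    two children may be kept in order or swapped.
    """
    words = tuple(words)
    if len(words) == 1:
        return {words}
    result = set()
    for k in range(1, len(words)):  # every possible split point
        lefts = itg_orderings(words[:k])
        rights = itg_orderings(words[k:])
        for left, right in product(lefts, rights):
            result.add(left + right)  # straight (monotone) order
            result.add(right + left)  # inverted order
    return result


def ist_itg_orderings(tree):
    """Orderings reachable when the source binary tree is fixed.

    A tree is either a word (str) or a pair (left_subtree, right_subtree);
    this encoding is an assumption of the sketch.
    """
    if isinstance(tree, str):
        return {(tree,)}
    left_tree, right_tree = tree
    result = set()
    for left, right in product(ist_itg_orderings(left_tree),
                               ist_itg_orderings(right_tree)):
        result.add(left + right)  # keep the children in source order
        result.add(right + left)  # rotate this node
    return result


itg = itg_orderings("abcd")
ist = ist_itg_orderings((("a", "b"), ("c", "d")))
print(len(itg), len(ist))
```

With the fixed tree ((a b) (c d)) there are three internal nodes, each either rotated or not, which gives 2^3 = 8 orderings, and every one of them is also an ITG ordering. An ordering such as "b d a c" is excluded by both sets of constraints.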