Linguistically annotated BTG for statistical machine translation

  • Authors:
  • Deyi Xiong;Min Zhang;Aiti Aw;Haizhou Li

  • Affiliations:
  • Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore

  • Venue:
  • COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
  • Year:
  • 2008

Abstract

Bracketing Transduction Grammar (BTG) is a natural choice for effective integration of desired linguistic knowledge into statistical machine translation (SMT). In this paper, we propose a Linguistically Annotated BTG (LABTG) for SMT. It conveys linguistic knowledge of source-side syntax structures to BTG hierarchical structures through linguistic annotation. From the linguistically annotated data, we learn annotated BTG rules and train a linguistically motivated phrase translation model and reordering model. We also present an annotation algorithm that captures syntactic information for BTG nodes. The experiments show that the LABTG approach significantly outperforms both a baseline BTG-based system and a state-of-the-art phrase-based system on the NIST MT-05 Chinese-to-English translation task. Moreover, we empirically demonstrate that the proposed method achieves better translation selection and phrase reordering.
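
For readers unfamiliar with BTG, the following is a minimal sketch of its standard rule set: a lexical rule plus two binary rules that merge neighbouring blocks in straight or inverted target order. It is not the paper's implementation. The `Node` class, the function names, and the `label` field (a stand-in for the source-side syntactic annotation that LABTG attaches to nodes) are all assumptions made for illustration, and the tokens `s1`, `t1` are placeholders rather than real data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """A BTG node: a source span paired with its target-side words."""
    source: List[str]
    target: List[str]
    # Hypothetical slot for a source-side syntactic annotation (assumption).
    label: Optional[str] = None


def lexical(src_word: str, tgt_word: str, label: Optional[str] = None) -> Node:
    # Lexical rule A -> x/y: a source item paired with its translation.
    return Node([src_word], [tgt_word], label)


def straight(left: Node, right: Node, label: Optional[str] = None) -> Node:
    # Straight rule A -> [A1 A2]: target order follows source order.
    return Node(left.source + right.source, left.target + right.target, label)


def inverted(left: Node, right: Node, label: Optional[str] = None) -> Node:
    # Inverted rule A -> <A1 A2>: the two target sides are swapped.
    return Node(left.source + right.source, right.target + left.target, label)


# Usage with placeholder tokens
a = lexical("s1", "t1")
b = lexical("s2", "t2")
print(straight(a, b).target)
print(inverted(a, b).target)
```

In both binary rules the source order is unchanged; only the target order differs, which is why the choice between them can be treated as a reordering decision that annotations on the nodes might inform.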