Improving phrase-based statistical machine translation with morphosyntactic transformation

  • Authors:
  • Thai Phuong Nguyen; Akira Shimazu

  • Affiliations:
  • School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan 923-1292 (both authors)

  • Venue:
  • Machine Translation
  • Year:
  • 2006

Abstract

We present a phrase-based statistical machine translation approach that uses linguistic analysis in the preprocessing phase. The linguistic analysis consists of morphological transformation and syntactic transformation. Because the word-order problem is handled by syntactic transformation, no reordering is performed in the decoding phase. For morphological transformation, we use hand-crafted transformational rules. For syntactic transformation, we propose a transformational model based on a probabilistic context-free grammar. This model is trained using a bilingual corpus and a broad-coverage parser of the source language. The approach is applicable to language pairs in which the target language is poor in resources. We considered translation from English to Vietnamese and from English to French. Our experiments showed significant BLEU-score improvements over Pharaoh, a state-of-the-art phrase-based SMT system.
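
The idea of syntactic transformation as a preprocessing step can be illustrated with a minimal sketch. It is not the authors' model: it only assumes that each context-free production of the source parse tree carries a probability distribution over permutations of its children, and that the most probable permutation is applied before translation. The `Node` class, the `REORDER` table, and its probabilities are hypothetical and invented for illustration; in the paper, the model is trained from a bilingual corpus and a source-language parser.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union


@dataclass
class Node:
    """A parse-tree node; leaves are plain word strings."""
    label: str
    children: List["Tree"] = field(default_factory=list)


Tree = Union[Node, str]

# Hypothetical table: production (lhs, rhs labels) -> {child permutation: probability}.
# The numbers are made up; a real model would estimate them from aligned data.
REORDER: Dict[Tuple[str, Tuple[str, ...]], Dict[Tuple[int, ...], float]] = {
    ("NP", ("DT", "JJ", "NN")): {(0, 1, 2): 0.2, (0, 2, 1): 0.8},
}


def label_of(tree: Tree) -> str:
    """Return the node label, or the word itself for a leaf."""
    return tree.label if isinstance(tree, Node) else tree


def transform(tree: Tree) -> Tree:
    """Reorder children bottom-up, picking the most probable permutation per production."""
    if isinstance(tree, str):
        return tree
    children = [transform(child) for child in tree.children]
    key = (tree.label, tuple(label_of(child) for child in children))
    options = REORDER.get(key)
    if options:
        best = max(options, key=options.get)
        children = [children[i] for i in best]
    return Node(tree.label, children)


def words(tree: Tree) -> List[str]:
    """Read the words off the tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    out: List[str] = []
    for child in tree.children:
        out.extend(words(child))
    return out


# English "a red car", reordered toward a noun-before-adjective target order.
sample = Node("NP", [Node("DT", ["a"]), Node("JJ", ["red"]), Node("NN", ["car"])])
reordered = words(transform(sample))
```

Because the source sentence is already in an approximation of target word order after this step, a phrase-based decoder can translate it monotonically, which is the property the abstract relies on when it states that no reordering is done during decoding.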