Bridging morpho-syntactic gap between source and target sentences for English-Korean statistical machine translation

  • Authors:
  • Gumwon Hong;Seung-Wook Lee;Hae-Chang Rim

  • Affiliations:
  • Korea University, Seoul, Korea;Korea University, Seoul, Korea;Korea University, Seoul, Korea

  • Venue:
  • ACLShort '09 Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Often, Statistical Machine Translation (SMT) between English and Korean suffers from null alignment. Previous studies have attempted to resolve this problem by removing unnecessary function words, or by reordering source sentences. However, the removal of function words can cause a serious loss in information. In this paper, we present a possible method of bridging the morpho-syntactic gap for English-Korean SMT. In particular, the proposed method tries to transform a source sentence by inserting pseudo words, and by reordering the sentence in such a way that both sentences have a similar length and word order. The proposed method achieves 2.4 increase in BLEU score over baseline phrase-based system.