Sub-sentence division for tree-based machine translation

  • Authors:
  • Hao Xiong; Wenwen Xu; Haitao Mi; Yang Liu; Qun Liu

  • Affiliations:
  • Chinese Academy of Sciences, Beijing, China (all authors)

  • Venue:
  • ACLShort '09: Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
  • Year:
  • 2009

Abstract

Tree-based statistical machine translation models have made significant progress in recent years, especially when 1-best trees are replaced with packed forests. However, because parsing accuracy usually drops dramatically as sentence length increases, translating long sentences often takes a long time and produces only degenerate translations. We propose a new method, sub-sentence division, that reduces decoding time and improves translation quality for tree-based translation. Our approach divides long sentences into several sub-sentences by exploiting tree structures. Large-scale experiments on the NIST 2008 Chinese-to-English test set show that our approach achieves an absolute improvement of 1.1 BLEU points over the baseline system in 50% less time.