Phrasal syntactic category sequence model for phrase-based MT

  • Authors: Hailong Cao; Eiichiro Sumita; Tiejun Zhao; Sheng Li

  • Affiliations: Harbin Institute of Technology, China; National Institute of Information and Communications Technology, Japan; Harbin Institute of Technology, China; Harbin Institute of Technology, China

  • Venue: CICLing'12: Proceedings of the 13th International Conference on Computational Linguistics and Intelligent Text Processing, Volume Part II
  • Year: 2012

Abstract

Incorporating target syntax into phrase-based machine translation (PBMT) can generate syntactically well-formed translations. We propose a novel phrasal syntactic category sequence (PSCS) model that allows a PBMT decoder to prefer more grammatical translations. We parse all the sentences on the target side of the bilingual training corpus. During the standard phrase pair extraction procedure, we assign a syntactic category to each phrase pair and build a PSCS model from the parallel training data. We then incorporate the PSCS model log-linearly into a standard PBMT system. The method is simple and yields a 0.7 BLEU point improvement over the baseline PBMT system.
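
The idea of scoring a sequence of phrasal categories and adding that score as a log-linear feature can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the model order, the smoothing, or the category inventory, so the bigram order, add-one smoothing, the labels `NP`/`VP`/`PP`, the feature names `tm` and `pscs`, and the weights below are all assumptions chosen for illustration.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # hypothetical sequence boundary markers


class CategoryBigramModel:
    """Bigram model over phrasal category sequences (assumed order: 2)."""

    def __init__(self, sequences):
        self.bigrams = Counter()   # counts of (previous, current) categories
        self.contexts = Counter()  # counts of the previous category
        self.vocab = {EOS}         # categories that can be predicted
        for seq in sequences:
            padded = [BOS] + list(seq) + [EOS]
            self.vocab.update(seq)
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def log_prob(self, seq):
        """Add-one smoothed log probability of a category sequence."""
        padded = [BOS] + list(seq) + [EOS]
        v = len(self.vocab)
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            count = self.bigrams[(prev, cur)] + 1
            total += math.log(count / (self.contexts[prev] + v))
        return total


def loglinear_score(features, weights):
    """Weighted sum of feature values, as in a log-linear model."""
    return sum(weights[name] * value for name, value in features.items())


# Toy category sequences, standing in for those read off the phrase
# segmentation of parsed target-side training sentences.
training = [["NP", "VP"], ["NP", "VP", "PP"], ["NP", "VP"]]
model = CategoryBigramModel(training)

good = model.log_prob(["NP", "VP"])  # category order seen in training
bad = model.log_prob(["VP", "NP"])   # category order never seen

weights = {"tm": 1.0, "pscs": 0.5}   # illustrative weights, not tuned
score_good = loglinear_score({"tm": -2.0, "pscs": good}, weights)
score_bad = loglinear_score({"tm": -2.0, "pscs": bad}, weights)
```

With the other feature values held equal, the hypothesis whose phrases follow a category order seen in training receives the higher combined score, which is the sense in which such a feature lets the decoder prefer more grammatical output. In a real system the feature weight would be tuned together with the other decoder features rather than set by hand.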