Grammar comparison study for translational equivalence modeling and statistical machine translation

  • Authors:
  • Min Zhang;Hongfei Jiang;Haizhou Li;Aiti Aw;Sheng Li

  • Affiliations:
  • Institute for Infocomm Research, Singapore;Harbin Institute of Technology, China;Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;Harbin Institute of Technology, China

  • Venue:
  • COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
  • Year:
  • 2008

Abstract

This paper presents a general platform, synchronous tree sequence substitution grammar (STSSG), for grammar comparison studies in Translational Equivalence Modeling (TEM) and Statistical Machine Translation (SMT). Under the STSSG platform, we compare the expressive power of various grammars through synchronous parsing and a real translation platform on a variety of Chinese-English bilingual corpora. Experimental results show that STSSG explains the data in parallel corpora better than the other grammars. Our study further finds that the complexity of structure divergence is much higher than suggested in the literature, which poses a major challenge for syntactic transformation-based SMT.