Tighter integration of rule-based and statistical MT in serial system combination

  • Authors:
  • Nicola Ueffing;Jens Stephan;Evgeny Matusov;Loïc Dugast;George Foster;Roland Kuhn;Jean Senellart;Jin Yang

  • Affiliations:
  • National Research Council of Canada (NRC), Gatineau, Québec, Canada;SYSTRAN SA, Paris, France;RWTH Aachen University, Aachen, Germany;SYSTRAN SA, Paris, France;National Research Council of Canada (NRC), Gatineau, Québec, Canada;National Research Council of Canada (NRC), Gatineau, Québec, Canada;SYSTRAN SA, Paris, France;SYSTRAN SA, Paris, France

  • Venue:
  • COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
  • Year:
  • 2008

Abstract

Recent papers have described machine translation (MT) based on an automatic post-editing or serial combination strategy, whereby the input language is first translated into the target language by a rule-based MT (RBMT) system, and the target language output is then automatically post-edited by a phrase-based statistical machine translation (SMT) system. This approach has been shown to improve MT quality over RBMT or SMT alone. In that previous work, the two systems were only very loosely coupled: the SMT system had access only to the final 1-best translations from RBMT. Furthermore, the previous work involved European language pairs and relatively small training corpora. In this paper, we describe a more tightly integrated serial combination for the Chinese-to-English MT task. We present experimental evaluation results on the 2008 NIST constrained data track, where the tighter coupling of the two systems achieves a significant gain in terms of both automatic and subjective metrics.
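
The loosely coupled serial combination described in the abstract can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the authors' system: the lexicon, the post-edit table, and the function names `rbmt_translate`, `smt_post_edit`, and `serial_combination` are all hypothetical, and plain dictionary lookups stand in for a real RBMT engine and a real phrase-based SMT decoder. The only point it shows is the data flow: the second stage sees nothing but the 1-best target-language string produced by the first.

```python
from typing import Dict, List

# Hypothetical toy lexicon standing in for a rule-based MT system
# (assumption: a real RBMT engine applies linguistic rules, not word lookup).
TOY_RBMT_LEXICON: Dict[str, str] = {
    "我": "I",
    "喜欢": "like",
    "茶": "the tea",
}

# Hypothetical toy "phrase table" that maps RBMT-style target phrases to
# corrected target phrases (assumption: a real SMT post-editor is trained
# on pairs of RBMT output and reference translations, and searches over
# scored hypotheses instead of doing string replacement).
TOY_POST_EDIT_TABLE: Dict[str, str] = {
    "like the tea": "like tea",
}


def rbmt_translate(source_tokens: List[str]) -> str:
    """Stage 1: produce a single (1-best) target-language string."""
    return " ".join(TOY_RBMT_LEXICON.get(tok, tok) for tok in source_tokens)


def smt_post_edit(rbmt_output: str) -> str:
    """Stage 2: rewrite the RBMT output, target language to target language."""
    edited = rbmt_output
    # Apply longer phrases first so they are not pre-empted by shorter ones.
    for phrase in sorted(TOY_POST_EDIT_TABLE, key=len, reverse=True):
        edited = edited.replace(phrase, TOY_POST_EDIT_TABLE[phrase])
    return edited


def serial_combination(source_tokens: List[str]) -> str:
    """Loose coupling: only the 1-best RBMT string is passed to stage 2."""
    return smt_post_edit(rbmt_translate(source_tokens))


if __name__ == "__main__":
    print(serial_combination(["我", "喜欢", "茶"]))
```

In this sketch the interface between the stages is a single string, which mirrors the loose coupling the abstract attributes to earlier work. How the paper tightens that interface is not specified in the abstract, so the sketch does not attempt to show it.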