SyMGiza++: symmetrized word alignment models for statistical machine translation

  • Authors:
  • Marcin Junczys-Dowmunt; Arkadiusz Szał

  • Affiliations:
  • Faculty of Mathematics and Computer Science, Adam Mickiewicz University, Poznań, Poland; Faculty of Mathematics and Computer Science, Adam Mickiewicz University, Poznań, Poland

  • Venue:
  • SIIS'11 Proceedings of the 2011 international conference on Security and Intelligent Information Systems
  • Year:
  • 2011

Abstract

We present SyMGiza++, a tool that computes symmetric word alignment models and can take advantage of multi-processor systems. A series of fairly simple modifications to the original IBM/Giza++ word alignment models allows the symmetrized models to be updated between chosen iterations of the original training algorithms. We achieve a relative alignment quality improvement of more than 17% compared to Giza++ and MGiza++ on the standard Canadian Hansards task, while maintaining the speed improvements provided by the parallel computation capability of MGiza++. Furthermore, the alignment models are evaluated in the context of phrase-based statistical machine translation, where a consistent improvement in BLEU scores is observed when SyMGiza++ is used instead of Giza++ or MGiza++.
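
For readers unfamiliar with what "symmetric" means for word alignments, the following is a minimal Python sketch of conventional post-hoc symmetrization, in which two directional alignments are combined by intersection and union. It is only a generic illustration, not the SyMGiza++ method, which according to the abstract updates symmetrized models between training iterations. The function name `symmetrize` and the toy alignment links are hypothetical.

```python
def symmetrize(src_to_tgt, tgt_to_src):
    """Combine two directional alignments into their intersection and union.

    src_to_tgt: set of (source_index, target_index) links from one direction.
    tgt_to_src: set of (target_index, source_index) links from the other direction.
    Both inputs are assumed toy data, not output of any real aligner.
    """
    # Flip the reverse direction so both sets use (source_index, target_index).
    flipped = {(s, t) for (t, s) in tgt_to_src}
    # Links proposed by both directions (tends to favor precision).
    intersection = src_to_tgt & flipped
    # Links proposed by either direction (tends to favor recall).
    union = src_to_tgt | flipped
    return intersection, union


# Hypothetical alignment links for a single sentence pair.
forward = {(0, 0), (1, 2), (2, 1)}
backward = {(0, 0), (2, 1), (3, 2)}
inter, union = symmetrize(forward, backward)
```

Here `inter` holds the links both directions agree on, while `union` holds every link proposed by either direction.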