A POS-based model for long-range reorderings in SMT

  • Authors:
  • Jan Niehues;Muntsin Kolss

  • Affiliations:
  • Universität Karlsruhe, Karlsruhe, Germany;Universität Karlsruhe, Karlsruhe, Germany

  • Venue:
  • StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper we describe a new approach to model long-range word reorderings in statistical machine translation (SMT). Until now, most SMT approaches are only able to model local reorderings. But even the word order of related languages like German and English can be very different. In recent years approaches that reorder the source sentence in a preprocessing step to better match target sentences according to POS(Part-of-Speech)-based rules have been applied successfully. We enhance this approach to model long-range reorderings by introducing discontinuous rules. We tested this new approach on a German-English translation task and could significantly improve the translation quality, by up to 0.8 BLEU points, compared to a system which already uses continuous POS-based rules to model short-range reorderings.