Toward hierarchical models for statistical machine translation of inflected languages

  • Authors:
  • Sonja Nießen; Hermann Ney

  • Affiliations:
  • University of Technology, Aachen, Germany; University of Technology, Aachen, Germany

  • Venue:
  • DMMT '01 Proceedings of the workshop on Data-driven methods in machine translation - Volume 14
  • Year:
  • 2001

Abstract

In statistical machine translation, correspondences between the words of the source and the target language are learned from bilingual corpora on the basis of so-called alignment models. Existing statistical MT systems often treat different derivatives of the same lemma as if they were independent of each other. In this paper we argue that the bilingual training data can be exploited better by explicitly taking into account the interdependencies among these derivatives. We pursue two directions: the use of hierarchical lexicon models, and the introduction of equivalence classes in order to ignore information that is not relevant to the translation task. The improvement in translation results is demonstrated on a German-English corpus.
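
The two directions named in the abstract can be illustrated with a small sketch. It is not the authors' model: the tag inventory, the choice of which tags count as irrelevant, the three abstraction levels, the interpolation weights, and the names `abstraction_levels`, `train`, and `prob` are all assumptions made for illustration. The sketch counts aligned word pairs at several levels of morphological abstraction and interpolates the resulting relative frequencies, so that an unseen inflected form can still borrow evidence from other forms of the same lemma.

```python
from collections import Counter, defaultdict

# Assumption: case tags do not affect the choice of the English word,
# so forms that differ only in case fall into one equivalence class.
IRRELEVANT_TAGS = frozenset({"nom", "acc", "dat", "gen"})


def abstraction_levels(lemma, tags):
    """Return lookup keys from most specific to most general."""
    tags = frozenset(tags)
    full = (lemma, tags)                       # fully inflected form
    reduced = (lemma, tags - IRRELEVANT_TAGS)  # equivalence class
    base = (lemma, frozenset())                # lemma only
    return [full, reduced, base]


def train(pairs):
    """Count aligned pairs ((lemma, tags), target_word) at every level."""
    counts = [defaultdict(Counter) for _ in range(3)]
    for (lemma, tags), target in pairs:
        for level, key in enumerate(abstraction_levels(lemma, tags)):
            counts[level][key][target] += 1
    return counts


def prob(counts, lemma, tags, target, weights=(0.5, 0.3, 0.2)):
    """Interpolate relative frequencies over the abstraction levels.

    The weights are illustrative and not tuned; levels with no
    observations contribute nothing to the sum.
    """
    total = 0.0
    for level, key in enumerate(abstraction_levels(lemma, tags)):
        seen = counts[level].get(key)
        if seen:
            total += weights[level] * seen[target] / sum(seen.values())
    return total


# Toy data: only the dative singular of "Haus" was observed.
counts = train([(("Haus", {"sg", "dat"}), "house")])
# The accusative singular is unseen as a full form, but it shares an
# equivalence class and a lemma with the observed form.
score = prob(counts, "Haus", {"sg", "acc"}, "house")
```

In this toy setup the unseen accusative form receives no mass from the full-form level but still gets support from the equivalence-class and lemma levels, which is the kind of sharing across derivatives that the abstract argues for.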