SMT experiments for Romanian and German using JRC-ACQUIS

  • Authors:
  • Monica Gavrila

  • Affiliations:
  • Hamburg University, Hamburg, Germany

  • Venue:
  • MRTECEEL '09 Proceedings of the Workshop on Multilingual Resources, Technologies and Evaluation for Central and Eastern European Languages
  • Year:
  • 2009

Abstract

One of the LT applications that ensures access to information in the user's mother tongue is machine translation (MT). Unfortunately, less widely spoken languages, a category in which the Balkan and Slavic languages can be included, have to overcome a major gap in language resources, reference systems and tools. In its simplest form, statistical machine translation (SMT) requires only a large parallel corpus, and therefore it seems to be a solution for these languages. In this paper, the performance of a Moses-based SMT system for Romanian and German is investigated using test data from two different domains: legislation (JRC-ACQUIS) and a manual for an electronic device. The results are compared with those produced by the Google online translation tool. An analysis of the translation results gives an overview of the main challenges and sources of translation errors in these experimental settings.