GREAT: open source software for statistical machine translation

  • Authors:
  • Jorge González; Francisco Casacuberta

  • Affiliations:
  • Departamento de Sistemas Informáticos y Computación, Instituto Tecnológico de Informática, Universitat Politècnica de València, València, Spain (both authors)

  • Venue:
  • Machine Translation
  • Year:
  • 2011

Abstract

This article describes the first public release of GREAT, an open-source software toolkit for statistical machine translation (SMT). GREAT follows a bilingual language modelling approach to SMT, which is so far implemented for n-gram models within the framework of stochastic finite-state transducers. The use of finite-state models is motivated by their simplicity, their versatility, and their lower computational cost compared with more expressive models. Moreover, if translation is assumed to be a subsequential process, finite-state models suffice to model the relations between a source and a target language. GREAT includes features commonly found in state-of-the-art SMT systems, such as phrase-based translation models and a log-linear framework for local features. Experimental results on the well-known Europarl corpus are reported to validate the software. GREAT achieves competitive translation quality with fewer model parameters and a shorter response time than Moses, the widely used state-of-the-art SMT system.
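
As a rough illustration of what a log-linear framework does in general, the sketch below scores each translation hypothesis as a weighted sum of log-domain feature values and picks the highest-scoring one. It is not taken from GREAT: the function name, feature names, weights, and feature values are all hypothetical and chosen only for illustration.

```python
import math


def log_linear_score(features, weights):
    """Return the weighted sum of feature values for one hypothesis.

    Both arguments are dicts keyed by feature name. All names used
    here are hypothetical and are not part of the GREAT toolkit.
    """
    return sum(weights[name] * value for name, value in features.items())


# Made-up feature values for two candidate translations:
# two log-probabilities and a word count.
hypotheses = {
    "the house is small": {
        "bilingual_lm": math.log(0.02),
        "target_lm": math.log(0.01),
        "length": 4,
    },
    "the home is little": {
        "bilingual_lm": math.log(0.03),
        "target_lm": math.log(0.001),
        "length": 4,
    },
}

# Made-up feature weights; in practice these would be tuned on held-out data.
weights = {"bilingual_lm": 1.0, "target_lm": 0.5, "length": 0.1}

# The decoder's choice is the hypothesis with the highest combined score.
best = max(hypotheses, key=lambda h: log_linear_score(hypotheses[h], weights))
print(best)
```

With these made-up numbers the first hypothesis wins: its much higher target language model probability outweighs its slightly lower bilingual model probability.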