Refined lexicon models for statistical machine translation using a maximum entropy approach

  • Authors: Ismael García Varea; Franz J. Och; Hermann Ney; Francisco Casacuberta

  • Affiliations: Univ. de Castilla-La Mancha, Albacete, Spain; Lehrstuhl für Inf. VI, RWTH Aachen, Ahornstr., Aachen, Germany; Lehrstuhl für Inf. VI, RWTH Aachen, Ahornstr., Aachen, Germany; Inst. Tecn. de Inf. (UPV), Valencia, Spain

  • Venue: ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
  • Year: 2001

Abstract

Typically, the lexicon models used in statistical machine translation systems do not include any linguistic or contextual information, which often leads to errors in word sense disambiguation. One way to deal with this problem within the statistical framework is to use maximum entropy methods. In this paper, we describe how to use this type of information within a statistical machine translation system. We show that it is possible to significantly decrease the training and test corpus perplexity of the translation models. In addition, we rescore N-best lists using our maximum entropy model and thereby obtain an improvement in translation quality. Experimental results are presented on the so-called "Verbmobil Task".
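
The three ingredients named in the abstract (a context-dependent maximum entropy lexicon probability, corpus perplexity, and N-best rescoring) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the two binary feature types (word pair, and word pair plus one context word), the weight dictionary, and the `scale` parameter are all hypothetical choices made for illustration.

```python
import math


def maxent_lexicon_prob(e, f, context, candidates, weights):
    """p(e | f, context) = exp(sum_i w_i * h_i) / Z, normalized over candidates.

    Assumed (hypothetical) binary features: ("pair", f, e) fires for the word
    pair, ("ctx", f, w, e) fires for each context word w around f.
    """
    def score(cand):
        s = weights.get(("pair", f, cand), 0.0)
        for w in context:
            s += weights.get(("ctx", f, w, cand), 0.0)
        return s

    scores = [score(c) for c in candidates]
    m = max(scores)  # subtract the maximum for numerical stability
    z = sum(math.exp(s - m) for s in scores)
    return math.exp(score(e) - m) / z


def perplexity(probs):
    """Perplexity of a sequence of per-word model probabilities."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))


def rescore_nbest(nbest, extra_logprob, scale=1.0):
    """Return the hypothesis with the best combined score.

    nbest: list of (hypothesis, baseline_log_score) pairs.
    extra_logprob: function giving the additional model's log score.
    """
    return max(nbest, key=lambda h: h[1] + scale * extra_logprob(h[0]))[0]
```

With no trained weights the model falls back to a uniform distribution over the candidate translations; context features with nonzero weights shift probability mass toward the translation that fits the surrounding words, which is the effect a lower perplexity would reflect.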