Combining morphosyntactic enriched representation with n-best reranking in statistical translation

  • Authors:
  • H. Bonneau-Maynard; A. Allauzen; D. Déchelotte; H. Schwenk

  • Affiliations:
  • LIMSI-CNRS, Orsay cedex, France; LIMSI-CNRS, Orsay cedex, France; LIMSI-CNRS, Orsay cedex, France; LIMSI-CNRS, Orsay cedex, France

  • Venue:
  • SSST '07 Proceedings of the NAACL-HLT 2007/AMTA Workshop on Syntax and Structure in Statistical Translation
  • Year:
  • 2007

Abstract

The purpose of this work is to explore the integration of morphosyntactic information into the translation model itself, by enriching words with their morphosyntactic categories. We investigate word disambiguation using morphosyntactic categories, n-best hypothesis reranking, and the combination of both methods with word or morphosyntactic n-gram language model reranking. Experiments are carried out on the English-to-Spanish translation task. Using the morphosyntactic language model alone does not result in any improvement in performance. However, combining morphosyntactic word disambiguation with a word-based 4-gram language model results in a relative improvement in the BLEU score of 2.3% on the development set and 1.9% on the test set.
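
The two ideas named in the abstract, enriching words with their morphosyntactic categories and reranking n-best hypotheses with a language model score, can be illustrated with a minimal Python sketch. It is not the authors' system: the function names `enrich` and `rerank`, the `_` separator, the Penn-Treebank-style tags, the single `lm_weight`, and the toy language model are all assumptions made for illustration, since the abstract does not specify these details.

```python
from typing import Callable, List, Sequence, Tuple


def enrich(tokens: Sequence[str], tags: Sequence[str], sep: str = "_") -> List[str]:
    """Attach a morphosyntactic tag to each word (hypothetical word_TAG format)."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    return [f"{word}{sep}{tag}" for word, tag in zip(tokens, tags)]


def rerank(
    nbest: List[Tuple[str, float]],
    lm_logprob: Callable[[str], float],
    lm_weight: float = 1.0,
) -> List[Tuple[str, float]]:
    """Sort (hypothesis, decoder_score) pairs, best first.

    The combined score is decoder_score + lm_weight * lm_logprob(hypothesis).
    This simple weighted sum is an assumption, not the paper's exact model.
    """
    return sorted(
        nbest,
        key=lambda hyp: hyp[1] + lm_weight * lm_logprob(hyp[0]),
        reverse=True,
    )


def toy_lm(sentence: str) -> float:
    """Stand-in for an n-gram language model: penalizes each token by 1."""
    return -float(len(sentence.split()))


if __name__ == "__main__":
    # "book" tagged as a verb is kept distinct from "book" tagged as a noun.
    print(enrich(["book", "flights"], ["VB", "NNS"]))

    hypotheses = [("a b c", -1.0), ("a b", -1.5)]
    print(rerank(hypotheses, toy_lm)[0][0])
```

In a real setup, `toy_lm` would be replaced by a trained word-based or tag-based n-gram model, and `lm_weight` would be tuned on a development set.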