Statistical machine translation of texts with misspelled words

  • Authors:
  • Nicola Bertoldi;Mauro Cettolo;Marcello Federico

  • Affiliations:
  • FBK - Fondazione Bruno Kessler, Trento, Italy;FBK - Fondazione Bruno Kessler, Trento, Italy;FBK - Fondazione Bruno Kessler, Trento, Italy

  • Venue:
  • HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
  • Year:
  • 2010

Abstract

This paper investigates the impact of misspelled words on statistical machine translation and proposes an extension of the translation engine for handling misspellings. The enhanced system decodes a word-based confusion network representing spelling variations of the input text. We present extensive experimental results on two translation tasks of increasing complexity, which show how misspellings of different types affect the performance of a statistical machine translation decoder and to what extent our enhanced system is able to recover from such errors.
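
To make the idea of a word-based confusion network concrete, the following is a minimal sketch and not the authors' implementation. It assumes a small known vocabulary, generates spelling alternatives by Levenshtein distance, and weights them uniformly; the function names (`edit_distance`, `build_confusion_network`), the `max_distance` parameter, and the uniform weighting are all hypothetical choices made for illustration.

```python
from typing import List, Set, Tuple

# One slot per input position; each slot holds (word, probability) alternatives.
ConfusionNetwork = List[List[Tuple[str, float]]]


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insert, delete, and substitute all cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def build_confusion_network(
    tokens: List[str], vocabulary: Set[str], max_distance: int = 1
) -> ConfusionNetwork:
    """Build one slot of spelling alternatives per input token.

    Assumption (illustrative only): in-vocabulary tokens are kept as-is, and
    out-of-vocabulary tokens are replaced by all vocabulary words within
    `max_distance` edits, with uniform probabilities.
    """
    network: ConfusionNetwork = []
    for token in tokens:
        if token in vocabulary:
            network.append([(token, 1.0)])
            continue
        candidates = sorted(
            w for w in vocabulary if edit_distance(token, w) <= max_distance
        )
        if not candidates:
            # No plausible correction: pass the token through unchanged.
            network.append([(token, 1.0)])
            continue
        p = 1.0 / len(candidates)
        network.append([(w, p) for w in candidates])
    return network


vocabulary = {"the", "cat", "car", "sat"}
network = build_confusion_network(["the", "cax", "sat"], vocabulary)
for slot in network:
    print(slot)
```

In this toy input, the misspelled token "cax" yields a slot with two alternatives, "car" and "cat", and a decoder that accepts confusion networks could then choose between them using its translation and language models. A realistic system would presumably use a proper error model for the weights rather than a uniform distribution.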