Exploiting N-best hypotheses for SMT self-enhancement

  • Authors:
  • Boxing Chen; Min Zhang; Aiti Aw; Haizhou Li

  • Affiliations:
  • Institute for Infocomm Research, Heng Mui Keng Terrace, Singapore (all authors)

  • Venue:
  • HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
  • Year:
  • 2008

Abstract

Word and n-gram posterior probabilities estimated on N-best hypotheses have been used to improve the performance of statistical machine translation (SMT) in a rescoring framework. In this paper, we extend the idea to estimate posterior probabilities on N-best hypotheses for translation phrase-pairs, target-language n-grams, and source word reorderings. The SMT system is self-enhanced with the posterior knowledge learned from the N-best hypotheses in a re-decoding framework. Experiments on the NIST Chinese-to-English task show performance improvements for all three strategies. Moreover, combining the three strategies achieves further improvements, outperforming the baseline by 0.67 BLEU points on the NIST-2003 set and 0.64 on the NIST-2005 set.
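
The abstract does not state the estimation formulas, so the following is only a minimal sketch of one common way to compute n-gram posterior probabilities from an N-best list. It assumes each hypothesis carries a model log-score, turns the scaled scores into hypothesis posteriors with a softmax, and sums the posterior mass of every hypothesis that contains a given n-gram. The function names, the `alpha` scaling factor, and the choice to count an n-gram once per hypothesis are illustrative assumptions, not the authors' method.

```python
import math
from collections import defaultdict


def hypothesis_posteriors(log_scores, alpha=1.0):
    """Normalize scaled log-scores into posteriors over the N-best list (softmax).

    `alpha` is an assumed scaling factor that flattens or sharpens the distribution.
    """
    if not log_scores:
        return []
    scaled = [alpha * s for s in log_scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def ngram_posteriors(nbest, n=2, alpha=1.0):
    """Estimate the posterior probability of each target n-gram.

    `nbest` is a list of (tokens, log_score) pairs. An n-gram's posterior is
    the summed posterior of the hypotheses that contain it (assumption: each
    distinct n-gram is counted once per hypothesis).
    """
    posteriors = hypothesis_posteriors([score for _, score in nbest], alpha)
    table = defaultdict(float)
    for (tokens, _), p in zip(nbest, posteriors):
        distinct = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        for gram in distinct:
            table[gram] += p
    return dict(table)


# Toy 2-best list with hypothetical log-scores.
nbest = [
    ("the cat sat".split(), math.log(3.0)),
    ("the cat sits".split(), math.log(1.0)),
]
bigram_post = ngram_posteriors(nbest, n=2)
```

In a re-decoding setup of the kind the abstract describes, a table like `bigram_post` would be passed back to the decoder as an additional knowledge source; how the paper integrates it is not specified here.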