N-gram posterior probability confidence measures for statistical machine translation: an empirical study

  • Authors:
  • Adrià Gispert;Graeme Blackwood;Gonzalo Iglesias;William Byrne

  • Affiliations:
  • Machine Intelligence Laboratory, Department of Engineering, Cambridge University, Cambridge, UK;IBM T.J. Watson Research, Yorktown Heights, USA 10598;Machine Intelligence Laboratory, Department of Engineering, Cambridge University, Cambridge, UK;Machine Intelligence Laboratory, Department of Engineering, Cambridge University, Cambridge, UK

  • Venue:
  • Machine Translation
  • Year:
  • 2013

Abstract

We report an empirical study of n-gram posterior probability confidence measures for statistical machine translation (SMT). We first describe an efficient and practical algorithm for rapidly computing n-gram posterior probabilities from large translation word lattices. These probabilities are shown to be a good predictor of whether or not the n-gram is found in human reference translations, motivating their use as a confidence measure for SMT. Comprehensive n-gram precision and word coverage measurements are presented for a variety of different language pairs, domains and conditions. We analyze the effect on reference precision of using single or multiple references, and compare the precision of posteriors computed from k-best lists to those computed over the full evidence space of the lattice. We also demonstrate improved confidence by combining multiple lattices in a multi-source translation framework.
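
To make the central quantity concrete, the following is a minimal sketch of n-gram posterior probabilities computed from a k-best list, the simpler of the two evidence spaces the abstract compares. It is not the lattice algorithm described in the paper. It assumes the commonly used definition in which the posterior of an n-gram is the total posterior mass of the hypotheses that contain it, and it assumes each hypothesis comes with a log-domain model score. The function name `ngram_posteriors`, the `max_order` parameter, and the input layout are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def ngram_posteriors(kbest, max_order=4):
    """Estimate n-gram posteriors from a k-best list (illustrative sketch).

    kbest: list of (tokens, log_score) pairs, where tokens is a list of
           words and log_score is an unnormalised log-domain model score
           (an assumed input format, not taken from the paper).
    Returns a dict mapping each n-gram (a tuple of words) to the summed
    posterior probability of the hypotheses that contain it.
    """
    # Normalise the scores into a distribution over hypotheses,
    # subtracting the maximum first for numerical stability.
    max_log = max(score for _, score in kbest)
    weights = [math.exp(score - max_log) for _, score in kbest]
    total = sum(weights)

    posteriors = defaultdict(float)
    for (tokens, _), weight in zip(kbest, weights):
        # Collect each distinct n-gram once per hypothesis, so a repeated
        # n-gram does not receive the hypothesis mass more than once.
        seen = set()
        for n in range(1, max_order + 1):
            for i in range(len(tokens) - n + 1):
                seen.add(tuple(tokens[i:i + n]))
        for ngram in seen:
            posteriors[ngram] += weight / total
    return dict(posteriors)
```

Under these assumptions, an n-gram that appears in every hypothesis receives a posterior of 1, while one that appears only in low-scoring hypotheses receives a value near 0. Thresholding these values is one way such a quantity could serve as a confidence measure.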