One translation per discourse

  • Author: Marine Carpuat
  • Affiliation: Columbia University, New York, NY
  • Venue: DEW '09 Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions
  • Year: 2009

Abstract

We revisit the one sense per discourse hypothesis of Gale et al. in the context of machine translation. Since a given sense can be lexicalized differently in translation, do we observe one translation per discourse? Analysis of manual translations reveals that the hypothesis still holds when translations in parallel text are used as sense annotation, thus confirming that translational differences represent useful sense distinctions. Analysis of Statistical Machine Translation (SMT) output shows that, although SMT systems ignore document structure, the one translation per discourse hypothesis is strongly supported, in part because of the low variability in SMT lexical choice. More interestingly, cases where the hypothesis does not hold can reveal lexical choice errors. A preliminary study shows that enforcing the one translation per discourse constraint in SMT can potentially improve translation quality, and that SMT systems might benefit from translating sentences within their entire document context.
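
To make the hypothesis concrete, the following is a minimal sketch of how one translation per discourse could be checked on word-aligned data, and how a simple consistency constraint could be applied afterwards. It is an illustration under stated assumptions, not the procedure used in the paper: the input format (triples of document ID, source word, aligned translation), the function names `translation_consistency` and `enforce_majority`, and the majority-vote repair strategy are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


def translation_consistency(aligned_tokens):
    """Summarize how consistently each source word is translated per document.

    aligned_tokens: iterable of (doc_id, source_word, translation) triples.
    This input format is an assumption made for illustration.
    """
    per_doc = defaultdict(Counter)
    for doc_id, src, tgt in aligned_tokens:
        per_doc[(doc_id, src)][tgt] += 1

    report = {}
    for key, counts in per_doc.items():
        total = sum(counts.values())
        if total < 2:
            # The hypothesis only concerns words repeated within a document.
            continue
        majority, freq = counts.most_common(1)[0]
        report[key] = {
            "consistent": len(counts) == 1,  # exactly one translation observed
            "majority": majority,            # most frequent translation
            "share": freq / total,           # fraction of occurrences using it
        }
    return report


def enforce_majority(aligned_tokens):
    """Replace each repeated word's translation with its per-document majority.

    Majority voting is a hypothetical stand-in for a consistency constraint;
    the abstract does not specify how the constraint was enforced.
    """
    tokens = list(aligned_tokens)
    report = translation_consistency(tokens)
    return [
        (d, s, report[(d, s)]["majority"] if (d, s) in report else t)
        for d, s, t in tokens
    ]


if __name__ == "__main__":
    sample = [
        ("d1", "bank", "banque"),
        ("d1", "bank", "banque"),
        ("d1", "bank", "rive"),
        ("d2", "bank", "rive"),
        ("d2", "bank", "rive"),
    ]
    for key, stats in translation_consistency(sample).items():
        print(key, stats)
```

In this sketch, a (document, word) pair flagged as inconsistent is a candidate lexical choice error to inspect, which mirrors the observation that violations of the hypothesis can reveal errors. Blindly rewriting to the majority translation can also introduce errors when a word is genuinely used in two senses within one document, so the repair step should be read as a toy baseline.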