Translating with non-contiguous phrases

  • Authors:
  • Michel Simard;Nicola Cancedda;Bruno Cavestro;Marc Dymetman;Eric Gaussier;Cyril Goutte;Kenji Yamada;Philippe Langlais;Arne Mauser

  • Affiliations:
  • Xerox Research Centre Europe;Xerox Research Centre Europe;Xerox Research Centre Europe;Xerox Research Centre Europe;Xerox Research Centre Europe;Xerox Research Centre Europe;Xerox Research Centre Europe;RALI/DIRO Université de Montréal;RWTH Aachen University

  • Venue:
  • HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
  • Year:
  • 2005

Abstract

This paper presents a phrase-based statistical machine translation method based on non-contiguous phrases, i.e. phrases with gaps. A method for producing such phrases from word-aligned corpora is proposed. A statistical translation model that deals with such phrases is also presented, as well as a training method based on the maximization of translation accuracy, as measured with the NIST evaluation metric. Translations are produced by means of a beam-search decoder. Experimental results are presented that demonstrate how the proposed method allows better generalization from the training data.
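
To make the notion of a "phrase with gaps" concrete, the following is a minimal illustrative sketch and not the paper's implementation. It assumes a hypothetical representation in which a phrase is a list of tokens and each gap marker (`GAP`, a name chosen here for illustration) stands for exactly one arbitrary word; the function `find_matches` is likewise hypothetical. It only shows how such a phrase can cover non-adjacent words of a source sentence.

```python
from typing import List

# Hypothetical gap marker: in this sketch, one GAP skips exactly one word.
GAP = "_"


def find_matches(phrase: List[str], sentence: List[str]) -> List[List[int]]:
    """Return, for each match, the sentence positions covered by the
    phrase's real words (positions under a GAP are left uncovered)."""
    n = len(phrase)
    matches = []
    # Slide a window of the phrase's length over the sentence.
    for start in range(len(sentence) - n + 1):
        window = sentence[start:start + n]
        # A GAP matches any word; every other token must match exactly.
        if all(p == GAP or p == w for p, w in zip(phrase, window)):
            covered = [start + i for i, p in enumerate(phrase) if p != GAP]
            matches.append(covered)
    return matches


sentence = "je ne veux plus danser".split()
# The non-contiguous phrase "ne _ plus" covers positions 1 and 3,
# leaving position 2 ("veux") free for another phrase.
print(find_matches(["ne", GAP, "plus"], sentence))
```

In this sketch the word under the gap stays uncovered, so a decoder could translate it with a different phrase; a contiguous phrase table would instead need a separate entry for every word that can appear between "ne" and "plus".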