cdec: a decoder, alignment, and learning framework for finite-state and context-free translation models

  • Authors: Chris Dyer (University of Maryland); Jonathan Weese (Johns Hopkins University); Hendra Setiawan (University of Maryland); Adam Lopez (University of Edinburgh); Ferhan Ture (University of Maryland); Vladimir Eidelman (University of Maryland); Juri Ganitkevitch (Johns Hopkins University); Phil Blunsom (Oxford University); Philip Resnik (University of Maryland)

  • Venue: ACLDemos '10, Proceedings of the ACL 2010 System Demonstrations

  • Year: 2010

Abstract

We present cdec, an open-source framework for decoding, aligning with, and training a number of statistical machine translation models, including word-based models, phrase-based models, and models based on synchronous context-free grammars. Using a single unified internal representation for translation forests, the decoder strictly separates model-specific translation logic from general rescoring, pruning, and inference algorithms. From this unified representation, the decoder can extract not only the 1- or k-best translations, but also alignments to a reference, or the quantities necessary to drive discriminative training using gradient-based or gradient-free optimization techniques. Its efficient C++ implementation means that its memory use and runtime performance are significantly better than those of comparable decoders.
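The separation the abstract describes, one forest representation shared by several inference routines, can be illustrated with a small sketch. This is not cdec's API or its C++ code: the names `Edge`, `Forest`, `Semiring`, and `inside`, as well as the toy weights, are assumptions made for illustration. The sketch stores a translation forest as a hypergraph and runs one generic bottom-up pass over it. Only the semiring changes between finding the best derivation score and summing over all derivations, the kind of quantity gradient-based training typically needs.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Edge:
    """A hyperedge: one rule application deriving `head` from `tails`."""
    head: int            # index of the node this edge derives
    tails: List[int]     # indices of antecedent nodes (empty for leaf rules)
    weight: float        # illustrative rule score, treated as a probability


@dataclass
class Forest:
    """A translation forest; nodes are assumed to be in topological order,
    so every tail index is smaller than its head, and the last node is the goal."""
    num_nodes: int
    edges: List[Edge] = field(default_factory=list)

    def incoming(self, node: int) -> List[Edge]:
        return [e for e in self.edges if e.head == node]


@dataclass
class Semiring:
    plus: Callable[[float, float], float]
    times: Callable[[float, float], float]
    zero: float
    one: float


# Same forest, different questions: best derivation vs. sum over derivations.
VITERBI = Semiring(plus=max, times=lambda a, b: a * b, zero=0.0, one=1.0)
SUM_PRODUCT = Semiring(plus=lambda a, b: a + b, times=lambda a, b: a * b,
                       zero=0.0, one=1.0)


def inside(forest: Forest, sr: Semiring) -> float:
    """Generic bottom-up pass; knows nothing about the translation model."""
    value = [sr.zero] * forest.num_nodes
    for node in range(forest.num_nodes):
        total = sr.zero
        for edge in forest.incoming(node):
            score = sr.times(sr.one, edge.weight)
            for tail in edge.tails:
                score = sr.times(score, value[tail])
            total = sr.plus(total, score)
        value[node] = total
    return value[-1]


# Toy forest: two source spans, each with translation options, combined
# at the goal node either monotonically or with the spans swapped.
toy = Forest(num_nodes=3, edges=[
    Edge(head=0, tails=[], weight=0.6),
    Edge(head=0, tails=[], weight=0.4),
    Edge(head=1, tails=[], weight=0.5),
    Edge(head=2, tails=[0, 1], weight=1.0),   # monotone combination
    Edge(head=2, tails=[1, 0], weight=0.5),   # swapped combination
])

best_score = inside(toy, VITERBI)        # score of the single best derivation
total_mass = inside(toy, SUM_PRODUCT)    # total weight of all derivations
```

In this sketch a word-based, phrase-based, or grammar-based model would differ only in how it builds the `Forest`. The `inside` routine and the semirings stay the same, which is the design point the abstract makes about keeping model-specific logic apart from inference.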