Joshua 2.0: a toolkit for parsing-based machine translation with syntax, semirings, discriminative training and other goodies

  • Authors:
  • Zhifei Li;Chris Callison-Burch;Chris Dyer;Juri Ganitkevitch;Ann Irvine;Sanjeev Khudanpur;Lane Schwartz;Wren N. G. Thornton;Ziyuan Wang;Jonathan Weese;Omar F. Zaidan

  • Affiliations:
  • Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD;University of Maryland, College Park, MD;Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD;University of Minnesota, Minneapolis, MN;Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD

  • Venue:
  • WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
  • Year:
  • 2010

Abstract

We describe the progress we have made in the past year on Joshua (Li et al., 2009a), an open source toolkit for parsing-based machine translation. The new functionality includes: support for translation grammars with a rich set of syntactic nonterminals, the ability for external modules to posit constraints on how spans in the input sentence should be translated, lattice parsing for dealing with input uncertainty, a semiring framework that provides a unified way of doing various dynamic programming calculations, variational decoding for approximating the intractable MAP decoding, hypergraph-based discriminative training for better feature engineering, a parallelized MERT module, document-level and tail-based MERT, visualization of the derivation trees, and a cleaner pipeline for MT experiments.
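To illustrate what a semiring framework for dynamic programming can look like, the following is a minimal sketch of an inside computation over a small acyclic hypergraph. It is not Joshua's API: the names `Semiring`, `inside`, `SUM_PRODUCT`, `VITERBI`, and the toy hypergraph are all hypothetical and chosen only for illustration. The point is that one traversal routine yields different quantities (total weight of all derivations, weight of the best derivation) just by swapping the semiring.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Semiring:
    """Hypothetical container for the two operations and their identities."""
    plus: Callable[[float, float], float]
    times: Callable[[float, float], float]
    zero: float
    one: float


# Sum-product semiring: total weight of all derivations.
SUM_PRODUCT = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)
# Viterbi (max-product) semiring: weight of the single best derivation.
VITERBI = Semiring(max, lambda a, b: a * b, 0.0, 1.0)

# Each node maps to its incoming hyperedges: (tail nodes, edge weight).
# A node with no incoming hyperedges is treated as a leaf (assumption).
Hypergraph = Dict[str, List[Tuple[Tuple[str, ...], float]]]


def inside(hg: Hypergraph, order: List[str], sr: Semiring) -> Dict[str, float]:
    """Compute inside values bottom-up; `order` must list tails before heads."""
    values: Dict[str, float] = {}
    for node in order:
        edges = hg[node]
        if not edges:
            values[node] = sr.one
            continue
        total = sr.zero
        for tails, weight in edges:
            score = weight
            for tail in tails:
                score = sr.times(score, values[tail])
            total = sr.plus(total, score)
        values[node] = total
    return values


# Toy hypergraph (made-up weights, for illustration only).
TOY: Hypergraph = {
    "A": [],
    "B": [],
    "C": [(("A",), 0.5), (("B",), 0.25)],
    "GOAL": [(("C", "A"), 0.5), (("B",), 0.5)],
}
ORDER = ["A", "B", "C", "GOAL"]

if __name__ == "__main__":
    print(inside(TOY, ORDER, SUM_PRODUCT)["GOAL"])
    print(inside(TOY, ORDER, VITERBI)["GOAL"])
```

Under these assumptions, adding another calculation (for example, counting derivations) only requires defining a new `Semiring` instance and suitable edge weights; the traversal itself does not change.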