Joshua: an open source toolkit for parsing-based machine translation

  • Authors and affiliations:
  • Zhifei Li (Johns Hopkins University, Baltimore, MD)
  • Chris Callison-Burch (Johns Hopkins University, Baltimore, MD)
  • Chris Dyer (University of Maryland, College Park, MD)
  • Juri Ganitkevitch (RWTH Aachen University, Germany)
  • Sanjeev Khudanpur (Johns Hopkins University, Baltimore, MD)
  • Lane Schwartz (University of Minnesota, Minneapolis, MN)
  • Wren N. G. Thornton (Johns Hopkins University, Baltimore, MD)
  • Jonathan Weese (Johns Hopkins University, Baltimore, MD)
  • Omar F. Zaidan (Johns Hopkins University, Baltimore, MD)

  • Venue:
  • StatMT '09: Proceedings of the Fourth Workshop on Statistical Machine Translation
  • Year:
  • 2009

Abstract

We describe Joshua, an open source toolkit for statistical machine translation. Joshua implements all of the algorithms required for synchronous context-free grammars (SCFGs): chart parsing, n-gram language model integration, beam and cube pruning, and k-best extraction. The toolkit also implements suffix-array grammar extraction and minimum error rate training. It uses parallel and distributed computing techniques for scalability. We demonstrate that the toolkit achieves state-of-the-art translation performance on the WMT09 French-English translation task.
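
To give a rough sense of the lazy enumeration idea behind cube pruning, the sketch below combines two ascending lists of sub-derivation costs and returns only the k cheapest pairs, expanding neighbours from a priority queue instead of scoring every pair. It is a simplified illustration and not code from Joshua: the function name `k_best_combinations` and its parameters are hypothetical, and costs are assumed to be purely additive, which ignores the language model interaction that makes real cube pruning approximate.

```python
import heapq


def k_best_combinations(left_costs, right_costs, k):
    """Return up to k lowest-cost (cost, i, j) pairs from two ascending cost lists.

    Hypothetical helper for illustration; assumes cost(i, j) is simply
    left_costs[i] + right_costs[j], with no language model term.
    """
    if not left_costs or not right_costs or k <= 0:
        return []

    # Start from the corner of the grid: the best item from each list.
    heap = [(left_costs[0] + right_costs[0], 0, 0)]
    seen = {(0, 0)}
    best = []

    while heap and len(best) < k:
        cost, i, j = heapq.heappop(heap)
        best.append((cost, i, j))
        # Only the two grid neighbours of a popped cell become candidates.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            in_range = ni < len(left_costs) and nj < len(right_costs)
            if in_range and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (left_costs[ni] + right_costs[nj], ni, nj))

    return best


if __name__ == "__main__":
    for cost, i, j in k_best_combinations([1.0, 2.0, 4.0], [0.5, 3.0], 3):
        print(cost, i, j)
```

With additive costs this enumeration is exact. Once a language model score depends on how the two pieces join, the grid is no longer monotone, and the same procedure becomes a heuristic that trades a small amount of search error for speed.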