A limited memory algorithm for bound constrained optimization
SIAM Journal on Scientific Computing
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
Minimum error rate training in statistical machine translation
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
A hierarchical phrase-based model for statistical machine translation
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Triplet lexicon models for statistical machine translation
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
11,001 new features for statistical machine translation
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Preference grammars: softening syntactic constraints to improve statistical machine translation
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Comparison of extended lexicon models in search and rescoring for SMT
NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
Joshua: an open source toolkit for parsing-based machine translation
StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation
A quantitative analysis of reordering phenomena
StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation
Extending statistical machine translation with discriminative and trigger-based lexicon models
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
Statistical Machine Translation
Statistical Machine Translation
Edit distances with block movements and error rate confidence estimates
Machine Translation
Training phrase translation models with leaving-one-out
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
The RWTH Aachen machine translation system for WMT 2010
WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
More linguistic annotation for statistical machine translation
WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Reproducible results in parsing-based machine translation: the JHU shared task submission
WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Jane: open source hierarchical translation, extended with reordering and lexicon models
WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Insertion and deletion models for statistical machine translation
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
The RWTH Aachen machine translation system for WMT 2012
WMT '12 Proceedings of the Seventh Workshop on Statistical Machine Translation
Hi-index | 0.00 |
In this article we will describe the design and implementation of Jane, an efficient hierarchical phrase-based (HPB) toolkit developed at RWTH Aachen University. The system has been used by RWTH at several international evaluation campaigns, including the WMT and NIST evaluations, and is now freely available for non-commercial application. We will go through the main features of Jane, which include, among others, support for different search strategies, different language model formats, support for syntax-based enhancements to the HPB machine translation paradigm, string-to-dependency translation, extended lexicon models, different methods for minimum-error-rate training and distributed operation on a computer cluster. Special attention has been paid to the efficiency of the decoder, clean code and quality assurance through unit and regression testing. Results on current machine translation tasks are reported, which show that the system is able to obtain state-of-the-art performance.