BLEU: a method for automatic evaluation of machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Minimum error rate training in statistical machine translation
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Learning non-isomorphic tree mappings for machine translation
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 2
Improved statistical alignment models
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Dependency treelet translation: syntactically informed phrasal SMT
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Tree-to-string alignment template for statistical machine translation
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Scalable inference and training of context-rich syntactic translation models
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Hierarchical Phrase-Based Translation
Computational Linguistics
Distributed language modeling for N-best list re-ranking
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
Rule filtering by pattern for efficient hierarchical translation
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Hierarchical phrase-based translation with weighted finite state transducers
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Efficient extraction of oracle-best translations from hypergraphs
NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
Joshua: an open source toolkit for parsing-based machine translation
StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation
Arabic cross-document coreference detection
ACLShort '09 Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
Demonstration of Joshua: an open source toolkit for parsing-based machine translation
ACLDemos '09 Proceedings of the ACL-IJCNLP 2009 Software Demonstrations
Variational decoding for statistical machine translation
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2
Better word alignments with supervised ITG models
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2
Joint parsing and alignment with weakly synchronized grammars
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Model combination for machine translation
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Hierarchical phrase-based machine translation with word-based reordering model
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Faster and smaller N-gram language models
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Fast inference in phrase extraction models with belief propagation
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Language model rest costs and space-efficient storage
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
We describe a scalable decoder for parsing-based machine translation. The decoder is written in Java and implements all the essential algorithms described in Chiang (2007): chart parsing, m-gram language model integration, beam and cube pruning, and unique k-best extraction. In addition, it exploits parallel and distributed computing techniques to make it scalable. We also propose an algorithm for maintaining equivalent language model states that exploits the back-off property of m-gram language models: instead of maintaining a separate state for each distinct sequence of "state" words, we merge multiple states that can be made equivalent for language model probability calculations due to back-off. We demonstrate experimentally that our decoder is more than 30 times faster than a baseline decoder written in Python. We propose to release our decoder as an open-source toolkit.
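The state-merging idea can be illustrated with a minimal Python sketch. It is not the decoder's implementation, only one way the back-off property might be used under simplifying assumptions: it handles the right-side context only, it leaves out the back-off weights (which a real decoder would have to fold into the hypothesis score whenever a word is dropped from the state), and the names `context_prefixes`, `equivalent_right_state`, and `TOY_NGRAMS` are hypothetical. The premise is that if a context is not a prefix of any m-gram listed in the model, every continuation must back off, so the oldest context word can no longer affect future probabilities.

```python
from typing import Iterable, Set, Tuple

Ngram = Tuple[str, ...]


def context_prefixes(ngrams: Iterable[Ngram]) -> Set[Ngram]:
    """Collect every proper prefix of every n-gram listed in the toy model."""
    prefixes: Set[Ngram] = set()
    for ngram in ngrams:
        for k in range(1, len(ngram)):
            prefixes.add(ngram[:k])
    return prefixes


def equivalent_right_state(words: Ngram, prefixes: Set[Ngram], order: int) -> Ngram:
    """Shorten the last (order - 1) words to the longest suffix that some
    listed n-gram can still extend. Words dropped here would only ever be
    reached through back-off, so they cannot distinguish two hypotheses.
    (Assumption: back-off weights are accounted for elsewhere.)"""
    state = tuple(words[-(order - 1):]) if order > 1 else ()
    while state and state not in prefixes:
        state = state[1:]  # drop the oldest word and retry with a shorter context
    return state


# Hypothetical n-grams standing in for the entries of a trigram model.
TOY_NGRAMS = [
    ("the", "cat", "sat"),
    ("the", "cat"),
    ("cat", "sat"),
    ("a", "dog"),
]
PREFIXES = context_prefixes(TOY_NGRAMS)

# Two hypotheses with different trailing words...
state_a = equivalent_right_state(("my", "black", "cat"), PREFIXES, order=3)
state_b = equivalent_right_state(("my", "white", "cat"), PREFIXES, order=3)
# ...share one shortened state, so a chart could keep a single entry for both.
can_merge = state_a == state_b
```

In this toy model neither "black cat" nor "white cat" begins a listed trigram, so both hypotheses reduce to the one-word state `("cat",)` and could be merged, whereas a hypothesis ending in "the cat" keeps its full two-word state because the trigram "the cat sat" is listed.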