Extending statistical machine translation with discriminative and trigger-based lexicon models

Authors:
Arne Mauser;Saša Hasan;Hermann Ney
Affiliations:
RWTH Aachen University, Germany;RWTH Aachen University, Germany;RWTH Aachen University, Germany
Venue:
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1
Year:
2009

Abstract

In this work, we propose two extensions of standard word lexicons in statistical machine translation (SMT): a discriminative word lexicon that uses sentence-level source information to predict the target words, and a trigger-based lexicon model that extends IBM Model 1 with a second trigger, allowing for a more fine-grained lexical choice of target words. The models capture dependencies that go beyond the scope of conventional SMT models such as phrase models and language models. We show that the models improve translation quality by 1% in BLEU over a competitive baseline on a large-scale task.
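
To make the contrast between the lexicon types concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that an IBM Model 1 style lexicon averages p(e | f) over the source words, that the two-trigger variant averages p(e | f, f') over unordered pairs of source words (the empty word is left out for brevity), and that the discriminative lexicon is a per-target-word logistic model over the bag of source words in the sentence. All function names, table layouts, and numbers are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def ibm1_prob(target_word, source_words, lexicon):
    """Single-trigger score: average p(e | f) over all source words.

    `lexicon` maps (target_word, source_word) to a probability (toy table).
    """
    if not source_words:
        return 0.0
    total = sum(lexicon.get((target_word, f), 0.0) for f in source_words)
    return total / len(source_words)


def two_trigger_prob(target_word, source_words, triplets):
    """Two-trigger score: average p(e | f, f') over unordered source pairs.

    `triplets` maps (target_word, f, f') to a probability, where f precedes
    f' in the sentence. Treating pairs as unordered is an assumption here.
    """
    pairs = list(combinations(source_words, 2))
    if not pairs:
        return 0.0
    total = sum(triplets.get((target_word, f1, f2), 0.0) for f1, f2 in pairs)
    return total / len(pairs)


def discriminative_prob(target_word, source_words, weights, bias=0.0):
    """Sentence-level score: logistic model over the set of source words.

    `weights` maps (target_word, source_word) to a feature weight; each
    distinct source word in the sentence fires one binary feature.
    """
    score = bias + sum(
        weights.get((target_word, f), 0.0) for f in set(source_words)
    )
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    source = ["das", "haus", "ist", "klein"]
    lexicon = {("house", "haus"): 0.8}
    triplets = {("house", "das", "haus"): 0.9, ("house", "haus", "ist"): 0.6}
    weights = {("house", "haus"): 2.0, ("house", "klein"): 0.5}

    print(ibm1_prob("house", source, lexicon))
    print(two_trigger_prob("house", source, triplets))
    print(discriminative_prob("house", source, weights, bias=-2.5))
```

In this toy setup the single-trigger score depends on one source word at a time, the two-trigger score can reward a target word only when two specific source words co-occur, and the discriminative score combines evidence from every source word in the sentence at once.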