Semi-supervised model adaptation for statistical machine translation

Authors:
Nicola Ueffing;Gholamreza Haffari;Anoop Sarkar
Affiliations:
Interactive Language Technologies Group, National Research Council Canada, Gatineau, Canada;School of Computing Science, Simon Fraser University, Burnaby, Canada;School of Computing Science, Simon Fraser University, Burnaby, Canada
Venue:
Machine Translation
Year:
2007

Citing 14
Cited 9

Combining labeled and unlabeled data with co-training

COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
The mathematics of statistical machine translation: parameter estimation

Computational Linguistics - Special issue on using large corpora: II
Unsupervised word sense disambiguation rivaling supervised methods

ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Minimum error rate training in statistical machine translation

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Understanding the Yarowsky Algorithm

Computational Linguistics
Statistical machine translation with word- and sentence-aligned parallel corpora

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Reranking and self-training for parser adaptation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Semi-supervised training for statistical word alignment

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Effective self-training for parsing

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Word-Level Confidence Estimation for Machine Translation

Computational Linguistics
Phrasetable smoothing for statistical machine translation

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
NRC's PORTAGE system for WMT 2007

StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
N-gram posterior probabilities for statistical machine translation

StatMT '06 Proceedings of the Workshop on Statistical Machine Translation

Domain adaptation for statistical machine translation with monolingual resources

StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation
An Intelligent Agent That Autonomously Learns How to Translate

WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 02
Context adaptation in statistical machine translation using models with exponentially decaying cache

DANLP 2010 Proceedings of the 2010 Workshop on Domain Adaptation for Natural Language Processing
Improving translation model by monolingual data

WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
Experiments with word alignment, normalization and clause reordering for SMT between English and German

WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
Domain adaptation techniques for machine translation and their evaluation in a real-world setting

Canadian AI'12 Proceedings of the 25th Canadian conference on Advances in Artificial Intelligence
Improving statistical machine translation for a resource-poor language using related resource-rich languages

Journal of Artificial Intelligence Research
Translation model adaptation for statistical machine translation with monolingual topic information

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
An intelligent Web agent that autonomously learns how to translate

Web Intelligence and Agent Systems

Quantified Score

Hi-index	0.00

Visualization

Abstract

Statistical machine translation systems are usually trained on large amounts of bilingual text (used to learn a translation model), and also large amounts of monolingual text in the target language (used to train a language model). In this article we explore the use of semi-supervised model adaptation methods for the effective use of monolingual data from the source language in order to improve translation quality. We propose several algorithms with this aim, and present the strengths and weaknesses of each one. We present detailed experimental evaluations on the French---English EuroParl data set and on data from the NIST Chinese---English large-data track. We show a significant improvement in translation quality on both tasks.