Minimum imputed risk: unsupervised discriminative training for machine translation

Authors:
Zhifei Li;Jason Eisner;Ziyuan Wang;Sanjeev Khudanpur;Brian Roark
Affiliations:
Google Research, Mountain View, CA;Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD;Oregon Health & Science University, Beaverton, Oregon
Venue:
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Year:
2011

Citing 21
Cited 0

Statistical analysis with missing data

Statistical analysis with missing data
On the limited memory BFGS method for large scale optimization

Mathematical Programming: Series A and B
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Minimum error rate training in statistical machine translation

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
An end-to-end discriminative approach to machine translation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Scalable inference and training of context-rich syntactic translation models

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
A better N-best list: practical determinization of weighted finite tree automata

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Online Passive-Aggressive Algorithms

The Journal of Machine Learning Research
Hierarchical Phrase-Based Translation

Computational Linguistics
Minimum risk annealing for training log-linear models

COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Lattice Minimum Bayes-Risk decoding for statistical machine translation

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Lattice-based minimum error rate training for statistical machine translation

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
11,001 new features for statistical machine translation

NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Joshua: an open source toolkit for parsing-based machine translation

StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation
Better k-best parsing

Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
Variational decoding for statistical machine translation

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2
First- and second-order expectation semirings with applications to minimum-risk training on translation forests

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
Unsupervised discriminative language model training for machine translation using simulated confusion sets

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters

Quantified Score

Hi-index	0.02

Visualization

Abstract

Discriminative training for machine translation has been well studied in the recent past. A limitation of the work to date is that it relies on the availability of high-quality in-domain bilingual text for supervised training. We present an unsupervised discriminative training framework to incorporate the usually plentiful target-language monolingual data by using a rough "reverse" translation system. Intuitively, our method strives to ensure that probabilistic "round-trip" translation from a target-language sentence to the source-language and back will have low expected loss. Theoretically, this may be justified as (discriminatively) minimizing an imputed empirical risk. Empirically, we demonstrate that augmenting supervised training with unsupervised data improves translation performance over the supervised case for both IWSLT and NIST tasks.