Topic models for dynamic translation model adaptation

Authors:
Vladimir Eidelman;Jordan Boyd-Graber;Philip Resnik
Affiliations:
University of Maryland, College Park, MD;University of Maryland, College Park, MD;University of Maryland, College Park, MD
Venue:
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Year:
2012

Citing 16
Cited 0

A systematic comparison of various statistical alignment models

Computational Linguistics
Latent dirichlet allocation

The Journal of Machine Learning Research
An empirical study of smoothing techniques for language modeling

ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Online Passive-Aggressive Algorithms

The Journal of Machine Learning Research
BiTAM: bilingual topic AdMixture models for word alignment

COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Bilingual LSA-based adaptation for statistical machine translation

Machine Translation
Language and translation model adaptation using comparable corpora

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Discriminative corpus weight estimation for machine translation

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
cdec: a decoder, alignment, and learning framework for finite-state and context-free translation models

ACLDemos '10 Proceedings of the ACL 2010 System Demonstrations
Holistic sentiment analysis across languages: multilingual supervised latent Dirichlet allocation

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Discriminative instance weighting for domain adaptation in statistical machine translation

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Two easy improvements to lexical weighting

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Domain adaptation via pseudo in-domain data selection

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Optimization strategies for online large-margin learning in machine translation

WMT '12 Proceedings of the Seventh Workshop on Statistical Machine Translation

Quantified Score

Hi-index	0.00

Visualization

Abstract

We propose an approach that biases machine translation systems toward relevant translations based on topic-specific contexts, where topics are induced in an unsupervised way using topic models; this can be thought of as inducing subcorpora for adaptation without any human annotation. We use these topic distributions to compute topic-dependent lexical weighting probabilities and directly incorporate them into our translation model as features. Conditioning lexical probabilities on the topic biases translations toward topic-relevant output, resulting in significant improvements of up to 1 BLEU and 3 TER on Chinese to English translation over a strong baseline.