Domain adaptation in statistical machine translation with mixture modelling

Authors:
Jorge Civera; Alfons Juan
Affiliations:
Universidad Politécnica de Valencia, Valencia, Spain; Universidad Politécnica de Valencia, Valencia, Spain
Venue:
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Year:
2007

Abstract

Mixture modelling is a standard technique for density estimation, but its use in statistical machine translation (SMT) has only recently begun to be explored. One of its main advantages is the ability to learn specific probability distributions that better fit subsets of the training data. This is particularly important in SMT, given the difficulty of translating polysemous terms whose meaning depends on the context in which they appear. In this paper, we describe a mixture extension of the HMM alignment model and the derivation of Viterbi alignments to feed a state-of-the-art phrase-based system. Experiments carried out on the Europarl and News Commentary corpora show the potential and the limitations of mixture modelling.
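
As a rough illustration of what a Viterbi alignment under a mixture of HMM alignment models could look like, the Python sketch below runs a standard Viterbi search inside each mixture component and keeps the best-scoring (component, alignment) pair. This is a minimal sketch under stated assumptions, not the formulation from the paper: the function names `viterbi_hmm` and `mixture_viterbi`, the dictionary-based parameter tables, the uniform initial distribution, the unnormalised jump table, the smoothing floor, and the choice to maximise over components rather than sum over them are all hypothetical.

```python
import math


def viterbi_hmm(src, tgt, lex, trans):
    """Best alignment of each source word to a target position for ONE component.

    lex[(f, e)] : hypothetical lexical probability p(f | e)
    trans[d]    : hypothetical jump probability for d = i - i_prev
                  (not renormalised per sentence length in this sketch)
    Returns (log_score, alignment), where alignment[j] indexes into tgt.
    """
    floor = 1e-12  # assumed smoothing floor for unseen events
    n_tgt = len(tgt)

    def emit(j, i):
        return math.log(lex.get((src[j], tgt[i]), floor))

    def jump(i_prev, i):
        return math.log(trans.get(i - i_prev, floor))

    # Assumed uniform initial distribution over target positions.
    delta = [math.log(1.0 / n_tgt) + emit(0, i) for i in range(n_tgt)]
    back = []
    for j in range(1, len(src)):
        new_delta, ptr = [], []
        for i in range(n_tgt):
            best_prev = max(range(n_tgt), key=lambda k: delta[k] + jump(k, i))
            new_delta.append(delta[best_prev] + jump(best_prev, i) + emit(j, i))
            ptr.append(best_prev)
        delta = new_delta
        back.append(ptr)

    # Trace back from the best final state.
    last = max(range(n_tgt), key=lambda i: delta[i])
    align = [last]
    for ptr in reversed(back):
        align.append(ptr[align[-1]])
    align.reverse()
    return delta[last], align


def mixture_viterbi(src, tgt, components):
    """Return (log_score, component_index, alignment) maximising
    log(weight_t) + Viterbi log-score of component t (max approximation)."""
    best = None
    for t, (weight, lex, trans) in enumerate(components):
        score, align = viterbi_hmm(src, tgt, lex, trans)
        score += math.log(weight)
        if best is None or score > best[0]:
            best = (score, t, align)
    return best


# Toy usage with made-up parameters: two components that prefer
# different translations of "casa".
jumps = {1: 0.6, 0: 0.2, -1: 0.2}
components = [
    (0.5, {("la", "the"): 0.9, ("casa", "house"): 0.9}, jumps),
    (0.5, {("la", "the"): 0.9, ("casa", "home"): 0.9}, jumps),
]
score, component, alignment = mixture_viterbi(
    ["la", "casa"], ["the", "house"], components
)
```

With these toy tables, the first component wins because only it assigns a non-floor probability to the pair ("casa", "house"). In a pipeline like the one the abstract describes, alignments of this kind would then be passed on to phrase extraction in a phrase-based system.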