BiTAM: bilingual topic AdMixture models for word alignment

Authors:
Bing Zhao;Eric P. Xing
Affiliations:
Carnegie Mellon University;Carnegie Mellon University
Venue:
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Year:
2006

Citing 10
Cited 15

Adaptive Parallel Sentences Mining from Web Bilingual News Collection

ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Latent dirichlet allocation

The Journal of Machine Learning Research
The mathematics of statistical machine translation: parameter estimation

Computational Linguistics - Special issue on using large corpora: II
An efficient method for determining bilingual word classes

EACL '99 Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics
The KANT system: fast, accurate, high-quality translation in practical domains

COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 3
HMM-based word alignment in statistical translation

COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
A syntax-based statistical translation model

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Phrasal cohesion and statistical machine translation

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Bilingual word spectral clustering for statistical machine translation

ParaText '05 Proceedings of the ACL Workshop on Building and Using Parallel Texts

Bilingual LSA-based adaptation for statistical machine translation

Machine Translation
Domain adaptation in statistical machine translation with mixture modelling

StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Cross-language linking of news stories on the web using interlingual topic modelling

Proceedings of the 2nd ACM workshop on Social web search and mining
Multilingual topic models for unaligned text

UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Cross-lingual latent topic extraction

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Word alignment with synonym regularization

ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Holistic sentiment analysis across languages: multilingual supervised latent Dirichlet allocation

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Bayesian word alignment for statistical machine translation

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Knowledge transfer across multilingual corpora via latent topics

PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I
Cache-based document-level statistical machine translation

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Latent association analysis of document pairs

Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Translation model adaptation for statistical machine translation with monolingual topic information

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
A topic similarity model for hierarchical phrase-based translation

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Topic models for dynamic translation model adaptation

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Discovering a lexicon of parts and attributes

ECCV'12 Proceedings of the 12th international conference on Computer Vision - Volume Part III

Quantified Score

Hi-index	0.00

Visualization

Abstract

We propose a novel bilingual topical admixture (BiTAM) formalism for word alignment in statistical machine translation. Under this formalism, the parallel sentence-pairs within a document-pair are assumed to constitute a mixture of hidden topics; each word-pair follows a topic-specific bilingual translation model. Three BiTAM models are proposed to capture topic sharing at different levels of linguistic granularity (i.e., at the sentence or word levels). These models enable word-alignment process to leverage topical contents of document-pairs. Efficient variational approximation algorithms are designed for inference and parameter estimation. With the inferred latent topics, BiTAM models facilitate coherent pairing of bilingual linguistic entities that share common topical aspects. Our preliminary experiments show that the proposed models improve word alignment accuracy, and lead to better translation quality.