Source language adaptation for resource-poor machine translation
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
We propose a novel language-independent approach for improving machine translation for resource-poor languages by exploiting their similarity to resource-rich ones. More precisely, we improve the translation from a resource-poor source language X1 into a resource-rich language Y, given a bi-text containing a limited number of parallel sentences for X1-Y and a larger bi-text for X2-Y, where X2 is a resource-rich language closely related to X1. This is achieved by taking advantage of the vocabulary overlap between X1 and X2 and of their similarities in spelling, word order, and syntax: (1) we improve the word alignments for the resource-poor language, (2) we further augment it with additional translation options, and (3) we take care of potential spelling differences through appropriate transliteration. The evaluation for Indonesian → English using Malay, and for Spanish → English using Portuguese while pretending that Spanish is resource-poor, shows absolute gains of up to 1.35 and 3.37 BLEU points, respectively, which is an improvement over the best competing approaches, while using much less additional data. Overall, our method cuts the amount of necessary "real" training data by a factor of 2-5.
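Step (2) above, augmenting the resource-poor language pair with additional translation options, can be illustrated with a minimal sketch. The abstract does not say how the options are combined, so everything below is an assumption made for illustration: the function name `merge_phrase_tables`, the dictionary representation of a phrase table, the rule that X1-Y entries take priority, and the origin flag. The toy entries and scores are invented and are not data from the paper.

```python
from typing import Dict, Tuple

# Hypothetical representation: (source phrase, target phrase) -> score.
PhraseTable = Dict[Tuple[str, str], float]
MergedTable = Dict[Tuple[str, str], Tuple[float, int]]


def merge_phrase_tables(small: PhraseTable, large: PhraseTable) -> MergedTable:
    """Merge two phrase tables, giving priority to the small X1-Y table.

    Each merged entry keeps its score plus an origin flag:
    1 if the pair came from the X1-Y table, 0 if it came only from X2-Y.
    This priority rule is an assumption, not the paper's stated method.
    """
    # Start from every entry of the small (resource-poor) table.
    merged: MergedTable = {pair: (score, 1) for pair, score in small.items()}
    # Add pairs that only the large (related-language) table provides.
    for pair, score in large.items():
        if pair not in merged:
            merged[pair] = (score, 0)
    return merged


# Toy example with invented scores.
small_table = {("rumah", "house"): 0.9}
large_table = {("rumah", "house"): 0.7, ("buku", "book"): 0.8}
merged_table = merge_phrase_tables(small_table, large_table)
```

Under these assumptions, a pair present in both tables keeps its X1-Y score, while a pair seen only in the larger X2-Y table is added with an origin flag of 0, which a decoder could use as an extra feature to weight the two sources differently.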