A systematic comparison of various statistical alignment models
Computational Linguistics
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
BLEU: a method for automatic evaluation of machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
HLT-NAACL-PARALLEL '03 Proceedings of the HLT-NAACL 2003 Workshop on Building and using parallel texts: data driven machine translation and beyond - Volume 3
Automatic generation of Japanese–English bilingual thesauri based on bilingual corpora
Journal of the American Society for Information Science and Technology - Research Articles
A joint source-channel model for machine transliteration
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
Adaptive string distance measures for bilingual dialect lexicon induction
ACL '07 Proceedings of the 45th Annual Meeting of the ACL: Student Research Workshop
Bayesian semi-supervised Chinese word segmentation for statistical machine translation
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Sampling alignment structure under a Bayesian translation model
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
On the importance of pivot language selection for statistical machine translation
NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
Morphological analysis and generation for Arabic dialects
Semitic '05 Proceedings of the ACL Workshop on Computational Approaches to Semitic Languages
Evaluation of string distance algorithms for dialectology
LD '06 Proceedings of the Workshop on Linguistic Distances
Bayesian unsupervised word segmentation with nested Pitman-Yor language modeling
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
OpenFst: a general and efficient weighted finite-state transducer library
CIAA'07 Proceedings of the 12th international conference on Implementation and application of automata
Transliteration generation and mining with limited training resources
NEWS '10 Proceedings of the 2010 Named Entities Workshop
Language independent transliteration mining system using finite state automata framework
NEWS '10 Proceedings of the 2010 Named Entities Workshop
Word segmentation for dialect translation
CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part II
Comparative study on corpora for speech translation
IEEE Transactions on Audio, Speech, and Language Processing
Hi-index | 0.00 |
Recent research on multilingual statistical machine translation (SMT) focuses on the usage of pivot languages in order to overcome resource limitations for certain language pairs. This paper proposes a new method to translate a dialect language into a foreign language by integrating transliteration approaches based on Bayesian co-segmentation (BCS) models with pivot-based SMT approaches. The advantages of the proposed method with respect to standard SMT approaches are three fold: (1) it uses a standard language as the pivot language and acquires knowledge about the relation between dialects and the standard language automatically, (2) it reduces the translation task complexity by using monotone decoding techniques, (3) it reduces the number of features in the log-linear model that have to be estimated from bilingual data. Experimental results translating four Japanese dialects (Kumamoto, Kyoto, Okinawa, Osaka) into four Indo-European languages (English, German, Russian, Hindi) and two Asian languages (Chinese, Korean) revealed that the proposed method improves the translation quality of dialect translation tasks and outperforms standard pivot translation approaches concatenating SMT engines for the majority of the investigated language pairs.