Report of NEWS 2010 transliteration mining shared task
NEWS '10 Proceedings of the 2010 Named Entities Workshop
Transliteration is generally defined as the phonetic translation of names across languages. Machine transliteration is a critical technology in many domains, such as machine translation and cross-language information retrieval and extraction. Recent research has shown that high-quality machine transliteration systems can be developed in a language-neutral manner from a reasonably sized, good-quality corpus (roughly 15K to 25K parallel names) for a given pair of languages. This shared task focuses on acquiring such good-quality name corpora in many languages, complementing the machine transliteration shared task conducted concurrently at the same NEWS 2010 workshop. Specifically, the task focuses on mining Wikipedia paired-entity data (also known as inter-wiki links) to produce high-quality transliteration data that may be used for transliteration tasks.