Computational Linguistics
Kernel Methods for Pattern Analysis
Weakly supervised named entity transliteration and discovery from multilingual comparable corpora
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Whitepaper of NEWS 2010 shared task on transliteration mining
NEWS '10 Proceedings of the 2010 Named Entities Workshop
Transliteration generation and mining with limited training resources
NEWS '10 Proceedings of the 2010 Named Entities Workshop
Transliteration mining with phonetic conflation and iterative training
NEWS '10 Proceedings of the 2010 Named Entities Workshop
Language independent transliteration mining system using finite state automata framework
NEWS '10 Proceedings of the 2010 Named Entities Workshop
Mining transliterations from Wikipedia using pair HMMs
NEWS '10 Proceedings of the 2010 Named Entities Workshop
Improved transliteration mining using graph reinforcement
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Transliteration mining using large training and test sets
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Leveraging supplemental representations for sequential transduction
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
A statistical model for unsupervised and semi-supervised transliteration mining
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
A Bayesian Alignment Approach to Transliteration Mining
ACM Transactions on Asian Language Information Processing (TALIP)
This report documents the Transliteration Mining Shared Task, which was run as part of the Named Entities Workshop (NEWS 2010), an ACL 2010 workshop. The shared task featured mining of name transliterations from paired Wikipedia titles in 5 language pairs, specifically between English and one of Arabic, Chinese, Hindi, Russian, and Tamil. In total, 5 groups took part in this shared task, participating in multiple mining tasks in different language pairs. The methodology and the data sets used in this shared task are published in the Shared Task White Paper [Kumaran et al., 2010]. We measure and report 3 metrics on the submitted results to calibrate the performance of individual systems on a commonly available Wikipedia dataset. We believe that the significant contributions of this shared task are (i) assembling a diverse set of participants working in the area of transliteration mining, (ii) creating a baseline performance of transliteration mining systems in a set of diverse languages using commonly available Wikipedia data, and (iii) providing a basis for meaningful comparison and analysis of trade-offs between the various algorithmic approaches used in mining. We believe that this shared task complements the NEWS 2010 transliteration generation shared task in enabling the development of practical systems with a small amount of seed data in a given pair of languages.