Algorithms for Arabic name transliteration
IBM Journal of Research and Development
A systematic comparison of various statistical alignment models
Computational Linguistics
Computational Linguistics
Extracting named entity translingual equivalence with limited resources
ACM Transactions on Asian Language Information Processing (TALIP)
Translating named entities using monolingual and bilingual resources
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Transliteration of proper names in cross-lingual information retrieval
MultiNER '03 Proceedings of the ACL 2003 workshop on Multilingual and mixed-language named entity recognition - Volume 15
Learning transliteration lexicons from the web
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
A modified joint source-channel model for transliteration
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Transliteration mining with phonetic conflation and iterative training
NEWS '10 Proceedings of the 2010 Named Entities Workshop
Improved transliteration mining using graph reinforcement
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Transliteration mining using large training and test sets
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Hi-index | 0.00 |
Machine transliteration has a number of applications in a variety of natural language processing related tasks such as machine translation, information retrieval and question-answering. For automated learning of machine transliteration, a large parallel corpus of names in two scripts is required. In this paper we present a simple yet powerful method for automatic mining of Hindi-English names from a parallel corpus. An average 93% precision and 85% recall is achieved in mining of proper names. The method works even with a small corpus. We compare our results with Giza++ word alignment tool that yields 30% precision and 63% recall on the same corpora. We also demonstrate that this very method of name mining works for other Indian languages as well.