We describe in detail a method for transliterating an English string into a foreign-language string, evaluated on five languages: Tamil, Hindi, Russian, Chinese, and Kannada. The method derives substring alignments from the training data and learns a weighted finite-state transducer from these alignments. We define an ε-extension Hidden Markov Model to derive alignments between training pairs, and a heuristic to extract the substring alignments from them. The method has only two tunable parameters, which can be optimized on held-out data.
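To make the idea of weighting substring alignments concrete, the following is a minimal sketch, not the method described above. It assumes the substring alignments are already given (the ε-extension HMM and the extraction heuristic are not reproduced), it replaces the weighted finite-state transducer with a context-free table of relative-frequency costs, and it decodes by dynamic programming over segmentations of the source string. The function names `learn_weights` and `transliterate`, the toy data, and the uppercase placeholder target symbols are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def learn_weights(aligned_pairs):
    """Estimate -log P(target substring | source substring) by relative frequency.

    `aligned_pairs` is a list of alignments; each alignment is a list of
    (source_substring, target_substring) tuples. This input format is an
    assumption made for illustration.
    """
    counts = defaultdict(Counter)
    for alignment in aligned_pairs:
        for src, tgt in alignment:
            counts[src][tgt] += 1
    weights = {}
    for src, tgt_counts in counts.items():
        total = sum(tgt_counts.values())
        weights[src] = {t: -math.log(c / total) for t, c in tgt_counts.items()}
    return weights


def transliterate(word, weights):
    """Return the lowest-cost output over all segmentations of `word`.

    Returns None when no segmentation is covered by the learned substrings.
    """
    max_len = max(len(s) for s in weights)
    # best[i] holds (cost, output) for the prefix word[:i], or None if unreachable.
    best = [None] * (len(word) + 1)
    best[0] = (0.0, "")
    for i in range(len(word)):
        if best[i] is None:
            continue
        cost, out = best[i]
        for j in range(i + 1, min(len(word), i + max_len) + 1):
            for tgt, w in weights.get(word[i:j], {}).items():
                cand = (cost + w, out + tgt)
                if best[j] is None or cand[0] < best[j][0]:
                    best[j] = cand
    return best[-1][1] if best[-1] else None


# Toy alignments with placeholder target symbols (not real transliterations).
toy_alignments = [
    [("sh", "S"), ("a", "A")],
    [("s", "Z"), ("a", "A")],
    [("sh", "S"), ("i", "I")],
]
weights = learn_weights(toy_alignments)
print(transliterate("sha", weights))
```

On this toy data the only covered segmentation of "sha" is "sh" followed by "a", since "h" never appears as a source substring on its own. A real transducer would additionally condition on context and would be trained on alignments produced by the aligner rather than written by hand.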