ε-extension Hidden Markov Models and weighted transducers for machine transliteration

Authors:
Balakrishnan Vardarajan;Delip Rao
Affiliations:
Johns Hopkins University;Johns Hopkins University
Venue:
NEWS '09 Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration
Year:
2009

Citing 6
Cited 0

Statistical transliteration for english-arabic cross language information retrieval

CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Machine transliteration

ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
A joint source-channel model for machine transliteration

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
A generic framework for machine transliteration

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Combining probability models and web mining models: a framework for proper name transliteration

Information Technology and Management
Whitepaper of NEWS 2009 machine transliteration shared task

NEWS '09 Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration

Quantified Score

Hi-index	0.00

Visualization

Abstract

We describe in detail a method for transliterating an English string to a foreign language string evaluated on five different languages, including Tamil, Hindi, Russian, Chinese, and Kannada. Our method involves deriving substring alignments from the training data and learning a weighted finite state transducer from these alignments. We define an ε-extension Hidden Markov Model to derive alignments between training pairs and a heuristic to extract the substring alignments. Our method involves only two tunable parameters that can be optimized on held-out data.