We report on our participation in the NEWS 2009 Machine Transliteration Shared Task. We applied an orthographic perceptron character edit model that we have used previously for name transliteration, enhancing it in two ways: by ranking candidate transliterations by the sum of their scores under two models, one trained to generate left-to-right and one right-to-left; and by constraining generated strings to be consistent with the character bigrams observed in the respective language's training data. Our poor showing in the official evaluation was due to a bug in the script used to produce competition-compliant output. Subsequent evaluation shows that our approach yielded comparatively strong performance on all alphabetic language pairs we attempted.
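
The two enhancements described above lend themselves to a short sketch. The following minimal Python illustration is not the authors' implementation: score_l2r and score_r2l are hypothetical stand-ins for the two trained perceptron edit models, and the bigram constraint is shown here as a post-hoc filter on candidates, whereas the system described constrains strings as they are generated.

```python
def char_bigrams(word):
    # Pad with boundary markers so word-initial and word-final
    # character pairs are constrained as well.
    padded = f"^{word}$"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def build_bigram_inventory(training_targets):
    # Collect every character bigram observed on the target side
    # of the training data for one language pair.
    inventory = set()
    for word in training_targets:
        inventory |= char_bigrams(word)
    return inventory


def rank_candidates(source, candidates, score_l2r, score_r2l, inventory):
    # Keep only candidates whose bigrams were all seen in training,
    # then rank by the sum of the two directional model scores.
    consistent = [c for c in candidates if char_bigrams(c) <= inventory]
    return sorted(
        consistent,
        key=lambda c: score_l2r(source, c) + score_r2l(source, c),
        reverse=True,
    )


if __name__ == "__main__":
    inventory = build_bigram_inventory(["anita", "anika", "amit"])
    # Toy stand-ins for the two trained edit models: here each just
    # prefers candidates whose length matches the source string.
    score_l2r = lambda s, c: -abs(len(s) - len(c))
    score_r2l = lambda s, c: -abs(len(s) - len(c))
    # "anixa" contains the unseen bigram "ix" and is filtered out.
    print(rank_candidates("anitha", ["anita", "anixa"],
                          score_l2r, score_r2l, inventory))
```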