Lightly supervised transliteration for machine translation

Authors:
Amit Kirschenbaum;Shuly Wintner
Affiliations:
University of Haifa, Haifa, Israel;University of Haifa, Haifa, Israel
Venue:
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Year:
2009

Citing 12
Cited 0

Algorithms for Arabic name transliteration

IBM Journal of Research and Development
The nature of statistical learning theory

The nature of statistical learning theory
The mathematics of statistical machine translation: parameter estimation

Computational Linguistics - Special issue on using large corpora: II
Machine transliteration

ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Translating named entities using monolingual and bilingual resources

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
New Support Vector Algorithms

Neural Computation
Part-of-speech tagging of modern hebrew text

Natural Language Engineering
Active sample selection for named entity transliteration

HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
Moses: open source toolkit for statistical machine translation

ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Translating names and technical terms in Arabic text

Semitic '98 Proceedings of the Workshop on Computational Approaches to Semitic Languages
Identification of transliterated foreign words in Hebrew script

CICLing'08 Proceedings of the 9th international conference on Computational linguistics and intelligent text processing

Quantified Score

Hi-index	0.00

Visualization

Abstract

We present a Hebrew to English transliteration method in the context of a machine translation system. Our method uses machine learning to determine which terms are to be transliterated rather than translated. The training corpus for this purpose includes only positive examples, acquired semi-automatically. Our classifier reduces more than 38% of the errors made by a baseline method. The identified terms are then transliterated. We present an SMT-based transliteration model trained with a parallel corpus extracted from Wikipedia using a fairly simple method which requires minimal knowledge. The correct result is produced in more than 76% of the cases, and in 92% of the instances it is one of the top-5 results. We also demonstrate a small improvement in the performance of a Hebrew-to-English MT system that uses our transliteration module.