Improving back-transliteration by combining information sources

  • Authors:
  • Slaven Bilac; Hozumi Tanaka

  • Affiliations:
  • Department of Computer Science, Tokyo Institute of Technology, Tokyo; Department of Computer Science, Tokyo Institute of Technology, Tokyo

  • Venue:
  • IJCNLP'04 Proceedings of the First International Joint Conference on Natural Language Processing
  • Year:
  • 2004

Abstract

Transliterating words and names from one language to another is a frequent and highly productive phenomenon. Transliteration is a lossy process, since important distinctions are not preserved. Hence, automatically converting transliterated words back into their original form is a real challenge. However, due to its wide applicability in machine translation (MT) and cross-language information retrieval (CLIR), it is a computationally interesting problem. Previously proposed back-transliteration methods are based on either phoneme modeling or grapheme modeling across languages. In this paper, we propose a new method that combines the two models to improve the back-transliteration of words transliterated into Japanese. Our experiments show that the resulting system outperforms single-model systems.