A hybrid back-transliteration system for Japanese

  • Authors: Slaven Bilac; Hozumi Tanaka

  • Affiliations: Tokyo Institute of Technology, Ookayama, Meguro, Tokyo, Japan (both authors)

  • Venue: COLING '04: Proceedings of the 20th International Conference on Computational Linguistics
  • Year: 2004

Abstract

Transliterating words and names from one language to another is a frequent and highly productive phenomenon. Transliteration is lossy, since important distinctions in the source language are not preserved in the process. Hence, automatically converting transliterated words back into their original form (back-transliteration) is a real challenge. Because of its wide applicability in machine translation (MT) and cross-language information retrieval (CLIR), it is also an interesting problem from a practical point of view. In this paper, we propose a new method that combines a transliterated-string segmentation module with phoneme-based and grapheme-based transliteration modules in order to improve the back-transliteration of Japanese words. Our experiments show that the hybrid approach achieves significant improvements.
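
To make the shape of such a hybrid pipeline concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the segmentation module works or how the two transliteration modules are combined. The greedy longest-match segmenter, the linear interpolation weight `alpha`, the function names, and all candidate scores used with it are assumptions chosen for illustration only.

```python
from typing import Dict, List, Set, Tuple


def segment(word: str, lexicon: Set[str]) -> List[str]:
    """Split a transliterated string by greedy longest match (assumed strategy).

    `lexicon` is a hypothetical set of known transliterated units. Any
    remainder that matches nothing is kept as a single final segment.
    """
    segments: List[str] = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in lexicon:
                segments.append(word[i:j])
                i = j
                break
        else:
            # No lexicon entry starts here: keep the rest unsplit.
            segments.append(word[i:])
            break
    return segments


def combine(
    phoneme_scores: Dict[str, float],
    grapheme_scores: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rank candidates by an interpolated score (assumed combination rule).

    score = alpha * phoneme + (1 - alpha) * grapheme, where a candidate
    missing from one module contributes 0.0 for that module.
    """
    candidates = set(phoneme_scores) | set(grapheme_scores)
    scored = {
        c: alpha * phoneme_scores.get(c, 0.0)
        + (1 - alpha) * grapheme_scores.get(c, 0.0)
        for c in candidates
    }
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
```

In this sketch, each segment returned by `segment` would be passed to both modules separately, and `combine` would merge their candidate lists, so a candidate supported by both the phoneme-based and the grapheme-based module outranks one proposed by only a single module.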