Generating diverse katakana variants based on phonemic mapping

Authors:
Kazuhiro Seki; Hiroyuki Hattori; Kuniaki Uehara
Affiliations:
Kobe University, Kobe, Japan; Google Inc., Shibuya, Japan; Kobe University, Kobe, Japan
Venue:
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2008

Machine transliteration

Computational Linguistics

Abstract

In Japanese, it is quite common for the same word to be written in several different ways. This is especially true for katakana words, which are typically used to transliterate foreign words. This ambiguity becomes a critical problem for automatic processing such as information retrieval (IR). To tackle it, we propose a simple but effective approach that generates katakana variants by considering the phonemic representation of a given word in its original language. The proposed approach is evaluated through an assessment of the variants it generates. In addition, the impact of the generated variants on IR is studied in comparison with an existing approach based on katakana rewriting rules.
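
The abstract does not spell out the mapping itself, so the following is only a minimal sketch of the general idea of phoneme-based variant generation, not the authors' method. The table PHONEME_TO_KATAKANA, the function generate_variants, and the segmentation of "violin" into phonemic units are all hypothetical and chosen for illustration: each unit maps to one or more plausible katakana renderings, and the variants are every combination of those renderings.

```python
from itertools import product

# Hypothetical table (an assumption, not taken from the paper): each phonemic
# unit of the source-language word maps to one or more katakana renderings.
PHONEME_TO_KATAKANA = {
    "va": ["ヴァ", "バ"],  # /va/ has two common renderings
    "i": ["イ"],
    "o": ["オ"],
    "li": ["リ"],
    "n": ["ン"],
}


def generate_variants(units):
    """Return every katakana string obtainable by choosing one rendering per unit."""
    options = [PHONEME_TO_KATAKANA[unit] for unit in units]
    return ["".join(parts) for parts in product(*options)]


# Assumed segmentation of "violin" into phonemic units, for illustration only.
variants = generate_variants(["va", "i", "o", "li", "n"])
print(variants)  # ['ヴァイオリン', 'バイオリン']
```

In an IR setting, variants generated this way could be used to expand a query so that documents using either spelling are matched; how the paper actually ranks or filters candidates is not stated in the abstract.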