Large scale collocation data and their application to Japanese word processor technology

  • Authors:
  • Yasuo Koyama;Masako Yasutake;Kenji Yoshimura;Kosho Shudo

  • Affiliations:
  • Fukuoka University, Fukuoka, Japan (all four authors)

  • Venue:
  • COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
  • Year:
  • 1998

Abstract

Word processors and computers used in Japan rely on a Japanese input method that combines keyboard input with conversion from Kana (phonetic) characters to Kanji (ideographic, Chinese) characters. The key to Kana-to-Kanji conversion technology is raising conversion accuracy through homophone processing, since Japanese has a great many homophonic Kanji. In this paper, we report the results of Kana-to-Kanji conversion experiments that implement homophone processing based on large-scale collocation data. We show that approximately 135,000 collocations raise conversion accuracy by 9.1% compared with a prototype system that has no collocation data.
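
To make the idea of collocation-based homophone processing concrete, the following is a minimal sketch and not the system described in the paper. It assumes a toy homophone dictionary and a handful of hand-picked noun-verb collocations; the names `HOMOPHONES`, `COLLOCATIONS`, and `convert` are hypothetical, and the fallback to the first listed candidate is an assumption made for illustration.

```python
# Toy homophone dictionary (assumed data): Kana reading -> candidate Kanji
# words, listed in an assumed default priority order.
HOMOPHONES = {
    "はし": ["橋", "箸", "端"],   # bridge, chopsticks, edge
    "かみ": ["紙", "髪", "神"],   # paper, hair, god
}

# Toy collocation data (assumed): (noun, verb) pairs that commonly co-occur.
COLLOCATIONS = {
    ("橋", "渡る"),   # cross a bridge
    ("箸", "使う"),   # use chopsticks
    ("紙", "折る"),   # fold paper
    ("髪", "切る"),   # cut hair
}


def convert(reading, neighbor):
    """Pick a Kanji word for `reading` given a neighboring word.

    A candidate that forms a known collocation with `neighbor` wins;
    otherwise the first candidate in the default order is returned.
    Readings missing from the dictionary are returned unchanged.
    """
    candidates = HOMOPHONES.get(reading, [reading])
    for candidate in candidates:
        if (candidate, neighbor) in COLLOCATIONS:
            return candidate
    return candidates[0]


if __name__ == "__main__":
    print(convert("はし", "使う"))  # 箸 (collocation match)
    print(convert("はし", "見る"))  # 橋 (no match, default candidate)
```

In this sketch the collocation table only overrides the default choice when a match exists, which mirrors the comparison in the abstract between a system with collocation data and a prototype without it.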