Processing homonyms in the Kana-to-Kanji conversion

  • Authors:
  • Masahito Takahashi;Tsuyoshi Shinchu;Kenji Yoshimura;Kosho Shudo

  • Affiliations:
  • Fukuoka University, Fukuoka, Japan;Fukuoka University, Fukuoka, Japan;Fukuoka University, Fukuoka, Japan;Fukuoka University, Fukuoka, Japan

  • Venue:
  • COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
  • Year:
  • 1996

Abstract

This paper proposes two new methods for identifying the correct meaning of Japanese homonyms in text, both based on noun-verb co-occurrence within a sentence, which can be obtained easily from corpora. The first method uses near co-occurrence data sets, constructed from this co-occurrence relation, to select the most feasible word among homonyms within the scope of a single sentence. The second method uses far co-occurrence data sets, constructed dynamically from the near co-occurrence data sets while the input sentences are processed, to select the most feasible word among homonyms within the scope of a sequence of sentences. In a kana-to-kanji (phonogram-to-ideograph) conversion experiment, the first method achieved an accuracy rate of 79.6% per word. This accuracy rate is 7.4% higher than that of the ordinary method based on word occurrence frequency.
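
The idea behind the first method can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the data format or the scoring rule, so the toy tables, the function names, the summed-count score, and the fallback to word frequency are all assumptions made for illustration. The sketch contrasts a frequency-only baseline with a selection that prefers the homonym whose recorded noun-verb co-occurrences match the verbs in the current sentence.

```python
from collections import Counter

# Hypothetical toy data, not taken from the paper.
# Near co-occurrence: candidate noun -> counts of verbs seen with it in a sentence.
NEAR_COOCCURRENCE = {
    "橋": Counter({"渡る": 5, "架ける": 3}),   # bridge: cross, build
    "箸": Counter({"使う": 4, "持つ": 2}),     # chopsticks: use, hold
    "端": Counter({"歩く": 1}),                # edge: walk
}

# Hypothetical corpus frequencies used by the frequency-only baseline.
WORD_FREQUENCY = {"橋": 10, "箸": 30, "端": 20}

# Kana reading -> homonymous kanji candidates.
HOMONYMS = {"はし": ["橋", "箸", "端"]}


def select_by_frequency(reading):
    """Baseline: pick the most frequent candidate, ignoring context."""
    return max(HOMONYMS[reading], key=lambda w: WORD_FREQUENCY.get(w, 0))


def select_by_near_cooccurrence(reading, verbs_in_sentence):
    """Pick the candidate that co-occurs most with the sentence's verbs.

    Assumption: the score is the sum of co-occurrence counts, and the
    frequency baseline is used when no candidate matches any verb.
    """
    def score(word):
        counts = NEAR_COOCCURRENCE.get(word, Counter())
        return sum(counts[v] for v in verbs_in_sentence)

    best = max(HOMONYMS[reading], key=score)
    if score(best) == 0:
        return select_by_frequency(reading)
    return best


if __name__ == "__main__":
    print(select_by_frequency("はし"))
    print(select_by_near_cooccurrence("はし", ["渡る"]))
```

With this toy data, the baseline always returns the most frequent candidate regardless of context, while the co-occurrence selection changes its answer depending on the verb in the sentence. The second method would additionally carry evidence across sentences, which this sketch does not attempt.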