Bringing the dictionary to the user: the FOKS system
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
This paper proposes two new methods for identifying the correct meaning of Japanese homonyms in text, based on noun-verb co-occurrence within a sentence, which can be obtained easily from corpora. The first method uses near co-occurrence data sets, constructed from this co-occurrence relation, to select the most plausible word among homonyms within the scope of a single sentence. The second method uses far co-occurrence data sets, constructed dynamically from the near co-occurrence data sets while input sentences are processed, to select the most plausible word among homonyms within the scope of a sequence of sentences. An experiment on kana-to-kanji (phonogram-to-ideograph) conversion showed that the first method achieves an accuracy of 79.6% per word. This is 7.4% higher than the accuracy of the ordinary method based on word occurrence frequency.
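The sentence-scope selection step of the first method can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scoring rule (summed pair counts, with word frequency as a tie-breaker), and the toy counts for the reading "hashi" are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter, defaultdict


def build_near_cooccurrence(pairs):
    """Count noun-verb pairs observed in the same sentence (hypothetical helper)."""
    table = defaultdict(Counter)
    for noun, verb in pairs:
        table[noun][verb] += 1
    return table


def select_homonym(candidates, context_verbs, near, word_freq):
    """Pick the candidate that co-occurs most often with the sentence's verbs.

    Assumed scoring: summed co-occurrence counts first, then plain word
    frequency as a fallback when the counts tie (including all zeros).
    """
    def score(word):
        counts = near.get(word, Counter())  # Counter returns 0 for unseen verbs
        co = sum(counts[v] for v in context_verbs)
        return (co, word_freq.get(word, 0))

    return max(candidates, key=score)


# Toy data, invented for illustration: three kanji words read "hashi".
pairs = (
    [("橋", "渡る")] * 3    # bridge + cross
    + [("箸", "使う")] * 2  # chopsticks + use
    + [("端", "歩く")]      # edge + walk
)
near = build_near_cooccurrence(pairs)
word_freq = {"橋": 10, "箸": 5, "端": 20}
candidates = ["橋", "箸", "端"]

chosen = select_homonym(candidates, ["使う"], near, word_freq)
baseline = max(candidates, key=lambda w: word_freq[w])
```

With these toy counts, `chosen` is 箸 because it is the only candidate seen with the verb 使う, whereas the frequency-only `baseline` is 端. When the sentence contains no known verb, the sketch degrades to the frequency baseline.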