Construction of a probabilistic hierarchical structure based on a Japanese corpus and a Japanese thesaurus

  • Authors:
  • Asuka Terai; Bin Liu; Masanori Nakagawa

  • Affiliations:
  • Tokyo Institute of Technology, Tokyo, Japan; Nissay Information Technology Co. Ltd., Tokyo, Japan; Tokyo Institute of Technology, Tokyo, Japan

  • Venue:
  • LKR'08 Proceedings of the 3rd international conference on Large-scale knowledge resources: construction and application
  • Year:
  • 2008

Abstract

The purpose of this study is to construct a probabilistic hierarchical structure of categories based on a statistical analysis of Japanese corpus data and to verify the validity of the structure through a psychological experiment. First, the co-occurrence frequencies of adjectives and nouns within modification relations were extracted from a Japanese corpus. Second, a probabilistic hierarchical structure was constructed based on the probability P(category|noun), which represents the category membership of the nouns, using categorization information from a thesaurus and a soft clustering method (Rose's method [1]) with co-occurrence frequencies as initial values. This method makes it possible to identify the constructed hierarchical structure. To examine the validity of the constructed hierarchy, a psychological experiment was conducted. The results of the experiment verified the psychological validity of the hierarchical structure.
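
The pipeline described in the abstract (adjective-noun co-occurrence counts, thesaurus-based initial categories, soft assignment P(category|noun)) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy counts, the category names, the use of KL divergence as the distance, the additive smoothing, and the fixed inverse temperature `beta` (with no annealing schedule) are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter, defaultdict


def conditional_distributions(pair_counts, adjectives, smoothing=1e-3):
    """Return P(adjective | noun) per noun, with additive smoothing (assumed)."""
    by_noun = defaultdict(Counter)
    for (adj, noun), count in pair_counts.items():
        by_noun[noun][adj] += count
    dists = {}
    for noun, counts in by_noun.items():
        total = sum(counts.values()) + smoothing * len(adjectives)
        dists[noun] = [(counts[a] + smoothing) / total for a in adjectives]
    return dists


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def soft_cluster(dists, initial_category, categories, beta=5.0, iterations=20):
    """Alternate centroid and membership updates; returns P(category | noun)."""
    nouns = sorted(dists)
    dim = len(next(iter(dists.values())))
    # Start from hard memberships taken from the (hypothetical) thesaurus labels.
    membership = {
        n: {c: 1.0 if initial_category[n] == c else 0.0 for c in categories}
        for n in nouns
    }
    for _ in range(iterations):
        # Each category centroid is the membership-weighted mean distribution.
        centroids = {}
        for c in categories:
            weight = sum(membership[n][c] for n in nouns)
            if weight == 0.0:
                centroids[c] = [1.0 / dim] * dim  # fall back to uniform
            else:
                centroids[c] = [
                    sum(membership[n][c] * dists[n][k] for n in nouns) / weight
                    for k in range(dim)
                ]
        # Soft assignment: closer centroids receive higher probability.
        for n in nouns:
            scores = {
                c: math.exp(-beta * kl(dists[n], centroids[c])) for c in categories
            }
            z = sum(scores.values())
            membership[n] = {c: s / z for c, s in scores.items()}
    return membership


# Toy (adjective, noun) modification counts; invented for illustration only.
pair_counts = {
    ("red", "apple"): 8, ("sweet", "apple"): 6,
    ("red", "strawberry"): 7, ("sweet", "strawberry"): 9,
    ("fast", "car"): 9, ("loud", "car"): 4, ("red", "car"): 3,
    ("fast", "train"): 8, ("loud", "train"): 6,
}
adjectives = ["red", "sweet", "fast", "loud"]
thesaurus = {"apple": "fruit", "strawberry": "fruit",
             "car": "vehicle", "train": "vehicle"}

dists = conditional_distributions(pair_counts, adjectives)
membership = soft_cluster(dists, thesaurus, ["fruit", "vehicle"])
for noun in sorted(membership):
    print(noun, {c: round(p, 3) for c, p in membership[noun].items()})
```

Starting from thesaurus labels keeps each soft cluster tied to a named category, which is one plausible reading of how the constructed structure remains identifiable. Lower values of `beta` give softer memberships, and higher values approach a hard assignment.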