Extraction of hierarchies based on inclusion of co-occurring words with frequency information

  • Authors:
  • Eiko Yamamoto; Kyoko Kanzaki; Hitoshi Isahara

  • Affiliations:
  • Computational Linguistics Group, National Institute of Information and Communications Technology, Kyoto, Japan (all three authors)

  • Venue:
  • IJCAI'05: Proceedings of the 19th International Joint Conference on Artificial Intelligence
  • Year:
  • 2005

Abstract

In this paper, we propose a method for automatically extracting word hierarchies based on the inclusion relations of word appearance patterns in corpora. We apply the complementary similarity measure (CSM) to determine a hierarchical structure of word meanings. The CSM is a similarity measure originally developed for recognizing degraded machine-printed text, and it exists in versions for both binary and gray-scale images. The CSM for binary images has previously been applied to estimate one-to-many relations, such as superordinate-subordinate relations, and to extract word hierarchies. The CSM for gray-scale images, however, has not been applied to natural language processing. Here, we apply the latter to extract word hierarchies from corpora. To do this, we use frequency information for co-occurring words, which is not considered when using the CSM for binary images. We compare our hierarchies with those obtained using the CSM for binary images and evaluate them by measuring their degree of agreement with the EDR electronic dictionary.
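
The abstract does not state the CSM formula, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes the form of the binary CSM commonly given in the literature, (ad - bc) / sqrt((a + c)(b + d)), where a, b, c and d count the co-occurrence dimensions present in both vectors, only the first, only the second, and neither. The function names, the max-based normalization of frequencies, and the frequency-weighted variant `csm_weighted` are assumptions made for this example; the exact gray-scale CSM used in the paper may differ.

```python
import math


def csm_binary(f, t):
    """CSM between two equal-length 0/1 vectors (assumed standard form)."""
    if len(f) != len(t):
        raise ValueError("vectors must have the same length")
    a = sum(1 for x, y in zip(f, t) if x and y)      # present in both
    b = sum(1 for x, y in zip(f, t) if x and not y)  # present only in f
    c = sum(1 for x, y in zip(f, t) if not x and y)  # present only in t
    d = len(f) - a - b - c                           # present in neither
    denom = math.sqrt((a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


def normalize(counts):
    """Scale raw co-occurrence counts into [0, 1] by the largest count.

    Hypothetical preprocessing step chosen for this sketch.
    """
    peak = max(counts, default=0)
    return [x / peak for x in counts] if peak else [0.0 for _ in counts]


def csm_weighted(f, t):
    """Illustrative frequency-weighted variant for values in [0, 1].

    The hard counts a, b, c, d are replaced by sums of products, so the
    result equals csm_binary on pure 0/1 input. This is an assumption,
    not necessarily the gray-scale CSM of the paper.
    """
    if len(f) != len(t):
        raise ValueError("vectors must have the same length")
    a = sum(x * y for x, y in zip(f, t))
    b = sum(x * (1 - y) for x, y in zip(f, t))
    c = sum((1 - x) * y for x, y in zip(f, t))
    d = sum((1 - x) * (1 - y) for x, y in zip(f, t))
    denom = math.sqrt((a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


if __name__ == "__main__":
    # Toy appearance patterns over eight hypothetical co-occurring words.
    broad = [1, 1, 1, 1, 0, 0, 0, 0]
    narrow = [1, 1, 0, 0, 0, 0, 0, 0]
    print(csm_binary(broad, narrow), csm_binary(narrow, broad))
    # Raw counts are normalized before using the weighted variant.
    print(csm_weighted(normalize([5, 3, 2, 1, 0, 0, 0, 0]),
                       normalize([4, 1, 0, 0, 0, 0, 0, 0])))
```

Because the denominator is not symmetric in its two arguments, the score depends on argument order. In this toy setup, the word whose appearance pattern includes the other's scores higher when passed first, which is the kind of asymmetry an inclusion-based hierarchy extraction can exploit.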