Inherited Feature-based Similarity Measure based on large semantic hierarchy and large text corpus

  • Authors:
  • Hideki Hirakawa; Zhonghui Xu; Kenneth Haase

  • Affiliations:
  • Toshiba R&D Center, Kawasaki, Japan; MIT Media Laboratory, Cambridge, MA; MIT Media Laboratory, Cambridge, MA

  • Venue:
  • COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
  • Year:
  • 1996

Abstract

We describe IFSM (Inherited Feature Similarity Measure), a model for calculating the similarity between objects (words/concepts) based on their common and distinctive features. We propose an implementation method that obtains features from abstracted triples extracted from a large text corpus using taxonomical knowledge. The model integrates two traditional approaches: relation-based and distribution-based similarity measures. An experiment over 80,000 surface triples, using our new concept abstraction method, which we call the flat probability grouping method, shows that an abstraction level of 3000 is a good basis for feature description.
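
The idea of scoring similarity from common and distinctive inherited features can be illustrated with a minimal sketch. The abstract does not give the IFSM formula, so the code below substitutes a plain Jaccard-style ratio (common features divided by common plus distinctive features); the taxonomy, the feature names, and the functions `inherited_features` and `similarity` are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional, Set

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "animal": None,
    "bird": "animal",
    "mammal": "animal",
    "sparrow": "bird",
    "eagle": "bird",
    "dog": "mammal",
}

# Hypothetical features attached directly to each concept.
OWN_FEATURES: Dict[str, Set[str]] = {
    "animal": {"eats", "moves"},
    "bird": {"flies", "lays_eggs"},
    "mammal": {"nurses_young"},
    "sparrow": {"chirps"},
    "eagle": {"hunts"},
    "dog": {"barks"},
}


def inherited_features(concept: str) -> Set[str]:
    """Collect a concept's own features plus those of all its ancestors."""
    features: Set[str] = set()
    node: Optional[str] = concept
    while node is not None:
        features |= OWN_FEATURES.get(node, set())
        node = PARENT[node]
    return features


def similarity(a: str, b: str) -> float:
    """Jaccard-style ratio: common / (common + distinctive).

    Assumption: this stands in for the IFSM formula, which the
    abstract does not state.
    """
    fa, fb = inherited_features(a), inherited_features(b)
    common = len(fa & fb)        # features shared by both concepts
    distinctive = len(fa ^ fb)   # features held by exactly one concept
    total = common + distinctive
    return common / total if total else 0.0


if __name__ == "__main__":
    print(similarity("sparrow", "eagle"))
    print(similarity("sparrow", "dog"))
```

In this toy setup, two birds share every feature inherited from "bird" and "animal", so they score higher than a bird and a mammal, which share only the features inherited from "animal".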