Inherited Feature-based Similarity Measure based on large semantic hierarchy and large text corpus

Authors:
Hideki Hirakawa; Zhonghui Xu; Kenneth Haase
Affiliations:
Toshiba R&D Center, Kawasaki, Japan; MIT Media Laboratory, Cambridge, MA; MIT Media Laboratory, Cambridge, MA
Venue:
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
Year:
1996

Abstract

We describe IFSM (Inherited Feature Similarity Measure), a model for calculating the similarity between objects (words or concepts) based on their common and distinctive features. We propose an implementation method that obtains features from abstracted triples, which are extracted from a large text corpus using taxonomical knowledge. The model integrates two traditional approaches: the relation-based similarity measure and the distribution-based similarity measure. An experiment over 80,000 surface triples, using our new concept abstraction method, which we call the flat probability grouping method, shows that an abstraction level of 3000 is a good basis for feature description.
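
The general idea of scoring similarity from common and distinctive features that are inherited through a taxonomy can be illustrated with a minimal sketch. This is not the IFSM formula, which the abstract does not give. The sketch assumes a Tversky-style ratio of common to distinctive features, and the taxonomy, the feature names, and the weights alpha and beta are all hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "entity",
}

# Hypothetical features attached directly to each concept. In the paper's
# setting, features would come from abstracted corpus triples instead.
OWN_FEATURES: Dict[str, Set[str]] = {
    "entity": {"exists"},
    "animal": {"eats", "moves"},
    "dog": {"barks"},
    "cat": {"meows"},
    "car": {"moves", "has_wheels"},
}


def inherited_features(concept: str) -> Set[str]:
    """Collect the concept's own features plus those of all its ancestors."""
    features: Set[str] = set()
    node: Optional[str] = concept
    while node is not None:
        features |= OWN_FEATURES.get(node, set())
        node = PARENT[node]
    return features


def similarity(a: str, b: str, alpha: float = 0.5, beta: float = 0.5) -> float:
    """Tversky-style ratio: common features over common plus weighted
    distinctive features (an assumed stand-in, not the IFSM definition)."""
    fa, fb = inherited_features(a), inherited_features(b)
    common = len(fa & fb)
    denominator = common + alpha * len(fa - fb) + beta * len(fb - fa)
    return common / denominator if denominator else 0.0


if __name__ == "__main__":
    print(similarity("dog", "cat"))  # siblings share the features of "animal"
    print(similarity("dog", "car"))  # related only through "entity" and "moves"
```

In this toy setting, "dog" and "cat" score higher than "dog" and "car" because the siblings inherit more shared features from their common ancestor.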