A Graph Modeling of Semantic Similarity between Words

Authors:
Marco A. Alvarez; SeungJin Lim
Affiliations:
Utah State University, USA; Utah State University, USA
Venue:
ICSC '07 Proceedings of the International Conference on Semantic Computing
Year:
2007

Cited 12

A study on similarity and relatedness using distributional and WordNet-based approaches
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Grouping product features using semi-supervised learning with soft-constraints
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics

Clustering product features for opinion mining
Proceedings of the fourth ACM international conference on Web search and data mining

A novel multi-aspect consistency measurement for ontologies
Journal of Web Engineering

Using properties to compare both words and clauses
KES-AMSTA'11 Proceedings of the 5th KES international conference on Agent and multi-agent systems: technologies and applications

Concept vector for semantic similarity and relatedness based on WordNet structure
Journal of Systems and Software

Weakness Finder: Find product weakness from Chinese reviews by using aspects based sentiment analysis
Expert Systems with Applications: An International Journal

A new approach to use concepts definitions for semantic relatedness measurement
AI'11 Proceedings of the 24th international conference on Advances in Artificial Intelligence

Structuring e-commerce inventory
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1

Computing text semantic relatedness using the contents and links of a hypertext encyclopedia
Artificial Intelligence

Computing term similarity by large probabilistic isA knowledge
Proceedings of the 22nd ACM international conference on information & knowledge management

A Semantic Similarity Measure between Nouns based on the Structure of Wordnet
Proceedings of International Conference on Information Integration and Web-based Applications & Services

Abstract

Measuring the semantic similarity between pairs of words is considered a fundamental operation in data mining and information retrieval. Nevertheless, developing a computational method that produces results close to human judgments remains difficult, partly because of the subjective nature of similarity. In this paper, we present a novel algorithm, SSA, for scoring the semantic similarity between words. Given two input words w_1 and w_2, SSA exploits their corresponding concepts, relationships, and descriptive glosses available in WordNet to build a rooted weighted graph G_sim. The output score is calculated by exploring the concepts present in G_sim and selecting the minimal distance between any two concepts c_1 and c_2 of w_1 and w_2, respectively. The distance is defined as a combination of: 1) the depth of the nearest common ancestor of c_1 and c_2 in G_sim, 2) the intersection of the descriptive glosses of c_1 and c_2, and 3) the shortest distance between c_1 and c_2 in G_sim. SSA achieves a correlation of 0.913 with the human ratings reported by Miller and Charles [15] for a dataset of 28 pairs of nouns. Furthermore, on the full dataset of 65 pairs presented by Rubenstein and Goodenough [20], the correlation between SSA results and the known human ratings is 0.903, which is higher than that of all other algorithms reported for the same dataset. These high correlations with human ratings suggest that SSA could be useful in solving several data mining and information retrieval problems.
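
The abstract does not give the formula that combines the three distance components, so the following Python sketch is only a rough illustration of the overall procedure and not the authors' SSA. It assumes a tiny hand-made taxonomy in place of WordNet, unit edge weights, Jaccard overlap of gloss words, and a hypothetical combining function; the names `PARENT`, `GLOSS`, `CONCEPTS`, `concept_distance`, and `word_distance` are invented for this example.

```python
from itertools import product

# Hypothetical toy taxonomy (child -> parent). Illustrative only, not WordNet data.
PARENT = {
    "entity": None,
    "artifact": "entity",
    "vehicle": "artifact",
    "wheeled_vehicle": "vehicle",
    "motor_vehicle": "wheeled_vehicle",
    "motorcar": "motor_vehicle",
    "motorcycle": "motor_vehicle",
    "bicycle": "wheeled_vehicle",
    "railcar": "wheeled_vehicle",
}

# Hypothetical glosses, reduced to sets of content words.
GLOSS = {
    "motorcar": {"motor", "vehicle", "four", "wheels", "engine"},
    "motorcycle": {"motor", "vehicle", "two", "wheels", "engine"},
    "bicycle": {"vehicle", "two", "wheels", "pedals"},
    "railcar": {"wheeled", "vehicle", "railroad", "tracks"},
}

# Hypothetical mapping from each word to its candidate concepts (senses).
CONCEPTS = {
    "car": ["motorcar", "railcar"],
    "bike": ["bicycle", "motorcycle"],
}


def path_to_root(concept):
    """Return the list of concepts from `concept` up to the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def concept_distance(c1, c2):
    """Combine the three ingredients named in the abstract (assumed formula)."""
    p1, p2 = path_to_root(c1), path_to_root(c2)
    ancestors_of_c2 = set(p2)
    # 1) Depth of the nearest common ancestor (the root has depth 0).
    nca = next(a for a in p1 if a in ancestors_of_c2)
    nca_depth = len(path_to_root(nca)) - 1
    # 2) Gloss intersection, measured here as Jaccard overlap (an assumption).
    g1, g2 = GLOSS[c1], GLOSS[c2]
    overlap = len(g1 & g2) / len(g1 | g2)
    # 3) Shortest path length; in a tree with unit weights it runs through the NCA.
    path_len = (len(p1) - 1) + (len(p2) - 1) - 2 * nca_depth
    # Assumed combination: a deeper ancestor and richer overlap shrink the distance.
    return path_len / ((1 + nca_depth) * (1 + overlap))


def word_distance(w1, w2):
    """Return the minimal distance over all concept pairs, with the chosen pair."""
    pairs = product(CONCEPTS[w1], CONCEPTS[w2])
    return min(((concept_distance(a, b), (a, b)) for a, b in pairs),
               key=lambda item: item[0])


if __name__ == "__main__":
    distance, pair = word_distance("car", "bike")
    print(pair, round(distance, 3))
```

With this toy data, the minimum over the four concept pairs of "car" and "bike" is reached by the motorcar and motorcycle senses, which share the deepest common ancestor and the largest gloss overlap. A real implementation would draw concepts, relations, and glosses from WordNet and would use the edge weights and combining function defined in the paper.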