Measuring the semantic similarity between pairs of words is considered a fundamental operation in data mining and information retrieval. Nevertheless, developing a computational method whose results are close to what humans would perceive remains difficult, partly owing to the subjective nature of similarity. This paper presents a novel algorithm, SSA, for scoring the semantic similarity between words. Given two input words w_1 and w_2, SSA exploits their corresponding concepts, relationships, and descriptive glosses available in WordNet to build a rooted weighted graph G_sim. The output score is calculated by exploring the concepts present in G_sim and selecting the minimal distance between any two concepts c_1 and c_2 of w_1 and w_2, respectively. The distance combines: 1) the depth of the nearest common ancestor of c_1 and c_2 in G_sim, 2) the intersection of the descriptive glosses of c_1 and c_2, and 3) the shortest distance between c_1 and c_2 in G_sim. SSA achieves a correlation of 0.913 with the human ratings reported by Miller and Charles [15] for a dataset of 28 pairs of nouns. Furthermore, on the full dataset of 65 pairs presented by Rubenstein and Goodenough [20], the correlation between SSA results and the known human ratings is 0.903, which is higher than that of all other reported algorithms for the same dataset. These high correlations suggest that SSA would be useful for solving several data mining and information retrieval problems.
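The abstract names the three ingredients of the distance but not the formula that combines them. The sketch below is therefore only an illustration of the overall scheme (take the minimum over all concept pairs of a combined distance), not the paper's actual method. The toy taxonomy, glosses, word-to-concept map, and the combination used in `concept_distance` are all assumptions made for this example. A small weighted tree stands in for G_sim, so the shortest path between two concepts runs through their nearest common ancestor.

```python
from itertools import product

# Hypothetical toy taxonomy standing in for G_sim: concept -> (parent, edge weight).
# The root has parent None. None of this data comes from WordNet.
TAXONOMY = {
    "entity": (None, 0.0),
    "object": ("entity", 1.0),
    "vehicle": ("object", 1.0),
    "car": ("vehicle", 1.0),
    "bicycle": ("vehicle", 1.0),
    "food": ("object", 1.0),
    "fruit": ("food", 1.0),
}

# Hypothetical glosses (short definitions) for the leaf concepts.
GLOSSES = {
    "car": "a motor vehicle with four wheels",
    "bicycle": "a wheeled vehicle with two wheels moved by pedals",
    "fruit": "the ripened reproductive body of a seed plant",
}

# Hypothetical mapping from words to their candidate concepts.
CONCEPTS = {
    "car": ["car"],
    "bike": ["bicycle"],
    "fruit": ["fruit"],
    "ride": ["car", "bicycle"],  # an ambiguous word with two concepts
}


def path_to_root(concept):
    """Return the list of nodes from `concept` up to the root, inclusive."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = TAXONOMY[concept][0]
    return path


def depth(concept):
    """Number of edges between `concept` and the root."""
    return len(path_to_root(concept)) - 1


def nearest_common_ancestor(c1, c2):
    """Deepest node that lies on both root paths."""
    ancestors = set(path_to_root(c1))
    for node in path_to_root(c2):
        if node in ancestors:
            return node
    raise ValueError("concepts do not share a root")


def weight_to_ancestor(concept, ancestor):
    """Sum of edge weights climbing from `concept` to `ancestor`."""
    total = 0.0
    while concept != ancestor:
        parent, weight = TAXONOMY[concept]
        total += weight
        concept = parent
    return total


def gloss_overlap(c1, c2):
    """Number of distinct words shared by the two glosses."""
    words1 = set(GLOSSES.get(c1, "").split())
    words2 = set(GLOSSES.get(c2, "").split())
    return len(words1 & words2)


def concept_distance(c1, c2):
    """Assumed combination: a longer path increases the distance, while a
    deeper common ancestor and a larger gloss overlap decrease it."""
    nca = nearest_common_ancestor(c1, c2)
    path = weight_to_ancestor(c1, nca) + weight_to_ancestor(c2, nca)
    return path / ((1 + depth(nca)) * (1 + gloss_overlap(c1, c2)))


def word_distance(w1, w2):
    """Minimal distance over every pair of concepts of the two words."""
    return min(
        concept_distance(c1, c2)
        for c1, c2 in product(CONCEPTS[w1], CONCEPTS[w2])
    )


if __name__ == "__main__":
    for pair in [("car", "bike"), ("car", "fruit"), ("ride", "bike")]:
        print(pair, word_distance(*pair))
```

With this toy data, "car" comes out closer to "bike" than to "fruit", and the ambiguous word "ride" is at distance zero from "bike" because one of its concepts coincides with the concept of "bike". A faithful implementation would replace the toy tables with WordNet synsets, relations, and glosses, and would use the weighting defined in the paper itself.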