Computing term similarity by large probabilistic isA knowledge

Authors:
Peipei Li;Haixun Wang;Kenny Q. Zhu;Zhongyuan Wang;Xindong Wu
Affiliations:
Hefei University of Technology, Hefei, China;Microsoft Research Asia, Beijing, China;Shanghai Jiao Tong University, Shanghai, China;Renmin University of China and Microsoft Research Asia, Beijing, China;University of Vermont, Vermont, USA
Venue:
Proceedings of the 22nd ACM International Conference on Information & Knowledge Management
Year:
2013

Abstract

Computing semantic similarity between two terms is essential for a variety of text analytics and understanding applications. However, existing approaches are better suited to measuring similarity between single words than between the more general multi-word expressions (MWEs), and they do not scale well. We therefore propose a lightweight and effective approach to semantic similarity that uses a large-scale semantic network automatically acquired from billions of web documents. Given two terms, we map them into the concept space and compare them there. Furthermore, we introduce a clustering approach that orthogonalizes the concept space to improve the accuracy of the similarity measure. Extensive studies demonstrate that our approach accurately computes the semantic similarity between terms, including MWEs and ambiguous terms, and significantly outperforms 12 competing methods.
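
The idea of mapping two terms into a concept space and comparing them there can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy isA table `TOY_ISA`, its probabilities, the helper names, and the choice of cosine similarity are hypothetical and are not taken from the paper, which relies on a large web-scale semantic network and an additional concept-clustering step that this sketch omits.

```python
import math

# Hypothetical toy isA knowledge (an assumption, not real data):
# each term maps to a distribution over concepts, P(concept | term).
TOY_ISA = {
    "apple": {"fruit": 0.6, "company": 0.4},
    "microsoft": {"company": 0.9, "software vendor": 0.1},
    "pear": {"fruit": 1.0},
}


def concept_vector(term, isa=TOY_ISA):
    """Map a term to its sparse concept vector (empty if the term is unknown)."""
    return isa.get(term.lower(), {})


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[concept] for concept, weight in u.items() if concept in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # no concept evidence for at least one term
    return dot / (norm_u * norm_v)


def term_similarity(term_a, term_b, isa=TOY_ISA):
    """Compare two terms in the concept space rather than by surface form."""
    return cosine(concept_vector(term_a, isa), concept_vector(term_b, isa))


if __name__ == "__main__":
    for a, b in [("apple", "pear"), ("apple", "microsoft"), ("microsoft", "pear")]:
        print(f"{a} vs {b}: {term_similarity(a, b):.3f}")
```

In this toy setting the ambiguous term "apple" shares concept mass with both "pear" (via "fruit") and "microsoft" (via "company"), while "microsoft" and "pear" share no concepts and score zero. Because the lookup is keyed on the whole term string, a multi-word expression would be handled the same way as a single word, provided it appears in the isA table.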