Estimating the similarity between interconnected objects arises in many real-world applications, and many domain-specific measures have been proposed. This work proposes a new perspective on specifying the similarity between resources in linked data and, more generally, between vertices of a directed, attributed graph. More precisely, the approach combines the structural properties of a graph with the attribute/value pairs of its vertices. We first compute the similarity between every pair of nodes using an extension of the Jaccard measure, which has the useful property of increasing as the number of matching attribute/value pairs of those resources increases. Highly similar vertices are then treated as a single node in the next step, yielding a graph called a CGraph. Nodes of a CGraph represent the groups of highly similar resources found in the first step, and links between resources are generalized to links between clusters. We then propose CRank, an extension of the structural algorithm, to merge highly similar nodes in this next step. The model is evaluated in a clustering procedure on our standard dataset, where the class label of each resource is estimated and compared with the ground-truth class label. Experimental results show that our model outperforms other clustering algorithms in terms of precision and recall.
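
The first step described above can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify their extension of the Jaccard measure or the CRank algorithm, so the code below assumes the plain Jaccard coefficient over sets of (attribute, value) pairs, a hypothetical similarity threshold, and transitive merging with a union-find structure. The names `jaccard`, `build_cgraph`, and the toy data are illustrative only.

```python
from itertools import combinations


def jaccard(a, b):
    """Plain Jaccard coefficient between two sets of (attribute, value) pairs.

    Assumption: the paper uses an *extension* of this measure whose exact
    form is not given in the abstract.
    """
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_cgraph(attrs, edges, threshold):
    """Merge nodes with similarity >= threshold and lift edges to clusters.

    attrs: dict mapping node -> set of (attribute, value) pairs
    edges: iterable of directed (source, target) node pairs
    threshold: hypothetical cut-off for "highly similar"
    """
    parent = {v: v for v in attrs}

    def find(v):
        # Follow parent pointers to the cluster representative,
        # halving the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Assumption: merging is transitive (single-linkage style).
    for u, v in combinations(sorted(attrs), 2):
        if jaccard(attrs[u], attrs[v]) >= threshold:
            parent[find(v)] = find(u)

    cluster_of = {v: find(v) for v in attrs}
    # A link between resources becomes a link between their clusters;
    # links inside a single cluster are dropped.
    cluster_edges = {
        (cluster_of[u], cluster_of[v])
        for u, v in edges
        if cluster_of[u] != cluster_of[v]
    }
    return cluster_of, cluster_edges


# Toy data (hypothetical): two films sharing two of four distinct pairs.
attrs = {
    "a": {("type", "Film"), ("genre", "Drama"), ("year", "1999")},
    "b": {("type", "Film"), ("genre", "Drama"), ("year", "2001")},
    "c": {("type", "Person"), ("role", "Actor")},
}
edges = [("a", "c"), ("b", "c"), ("a", "b")]

cluster_of, cluster_edges = build_cgraph(attrs, edges, threshold=0.5)
```

With this toy input, nodes `a` and `b` fall into one cluster, and the three resource-level links collapse into a single cluster-level link to `c`. Any structural ranking over the resulting graph, such as the CRank step the abstract mentions, would operate on `cluster_edges` rather than on the original edges.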