Assessing single-pair similarity over graphs by aggregating first-meeting probabilities

Authors:
Jun He;Hongyan Liu;Jeffrey Xu Yu;Pei Li;Wei He;Xiaoyong Du
Venue:
Information Systems
Year:
2014


Abstract

Link-based similarity plays an important role in measuring similarities between nodes in a graph. As a widely used link-based similarity measure, SimRank scores the similarity between two nodes as the first-meeting probability of two random surfers. However, because real-world graphs are large and change dynamically, it is not viable to frequently update the whole similarity matrix. Also, users are often concerned only with the similarities of a small subset of nodes in a graph. In such cases, the existing approaches need to compute the similarities of all node pairs simultaneously, and thus suffer from high computation cost. In this paper, we propose a new algorithm, Iterative Single-Pair SimRank (ISP), based on the random surfer-pair model, to compute the SimRank similarity score for a single pair of nodes in a graph. To avoid computing the similarities of all other node pairs, we introduce a new data structure, the position matrix, to facilitate computation of the first-meeting probabilities of two random surfers, and give two optimization techniques to further enhance its performance. In addition, we theoretically prove that the time cost of ISP is always less than that of the original SimRank algorithm. Comprehensive experiments conducted on both synthetic and real datasets demonstrate the effectiveness and efficiency of our approach.
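
To make the random surfer-pair idea concrete, the following is a minimal Python sketch of scoring a single pair by aggregating first-meeting probabilities. It is not the ISP algorithm from the paper, and the dictionary of surfer positions is not the paper's position matrix. The function name `single_pair_simrank`, the `in_neighbors` adjacency format, the decay factor of 0.8, and the `max_steps` cutoff are all assumptions made for illustration. Two surfers start at the query nodes and walk backward along in-links. The probability that they first meet after exactly k steps is weighted by decay^k and summed.

```python
from collections import defaultdict


def single_pair_simrank(in_neighbors, a, b, decay=0.8, max_steps=10):
    """Approximate SimRank for one pair (a, b) only.

    in_neighbors: dict mapping each node to a list of its in-neighbors
                  (hypothetical input format chosen for this sketch).
    decay:        SimRank decay factor C, assumed to lie in (0, 1).
    max_steps:    number of backward steps to aggregate (truncation).
    """
    if a == b:
        return 1.0
    # Probability that the two surfers sit at (u, v) with u != v
    # and have not met at any earlier step.
    dist = {(a, b): 1.0}
    score = 0.0
    for step in range(1, max_steps + 1):
        nxt = defaultdict(float)
        met = 0.0  # probability of meeting for the first time at this step
        for (u, v), p in dist.items():
            in_u = in_neighbors.get(u, ())
            in_v = in_neighbors.get(v, ())
            if not in_u or not in_v:
                continue  # one surfer cannot move, so this pair never meets
            share = p / (len(in_u) * len(in_v))
            for x in in_u:
                for y in in_v:
                    if x == y:
                        met += share
                    else:
                        nxt[(x, y)] += share
        score += (decay ** step) * met
        dist = nxt
        if not dist:
            break  # no unmet probability mass is left
    return score


# Example: node "c" links to both "a" and "b".
graph = {"a": ["c"], "b": ["c"]}
print(single_pair_simrank(graph, "a", "b"))
```

Only the node pairs reachable backward from the query pair are ever touched, which is the motivation the abstract gives for single-pair computation. The sketch makes no attempt at the two optimization techniques the abstract mentions.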