On the efficiency of estimating penetrating rank on large graphs
SSDBM'12 Proceedings of the 24th international conference on Scientific and Statistical Database Management
P-Rank (Penetrating Rank) has been proposed as a useful measure of structural similarity that takes both incoming and outgoing edges into account in ubiquitous networks. Existing work typically computes P-Rank similarity iteratively using memoization, which requires cubic time in the worst case. Moreover, previous methods focus mainly on deterministic computation of P-Rank and lack a probabilistic framework that scales well to large graphs. In this paper, we propose two efficient algorithms for computing P-Rank on large graphs. The first rests on the observation that many objects in a real graph share similar neighborhood structures. By merging such objects through an explicit low-rank factorization, we devise a deterministic algorithm that computes P-Rank in quadratic time. The second rests on the observation that, once the iterative form of P-Rank is converted into a matrix power series, random sampling can be used to compute P-Rank probabilistically in linear time with provable accuracy guarantees. Empirical results on both real and synthetic datasets show that our approaches achieve high time efficiency with controlled error and outperform the baseline algorithms by at least one order of magnitude.
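For orientation, the iterative form of P-Rank that the abstract refers to can be sketched as below. This is a minimal illustration of the standard P-Rank recurrence, not either of the two algorithms proposed in the paper, and it omits the memoization used by existing methods. The function name `p_rank`, the edge-list input, and the default values of the weighting factor `lam` and the damping factors `c_in` and `c_out` are all assumptions chosen for illustration; the abstract specifies none of them.

```python
def p_rank(edges, n, lam=0.5, c_in=0.8, c_out=0.8, iters=10):
    """Naive iterative P-Rank on a directed graph with nodes 0..n-1.

    lam weights the in-link term against the out-link term; c_in and
    c_out are damping factors. All defaults are illustrative assumptions.
    """
    ins = [[] for _ in range(n)]   # ins[v]  = in-neighbors of v
    outs = [[] for _ in range(n)]  # outs[u] = out-neighbors of u
    for u, v in edges:
        outs[u].append(v)
        ins[v].append(u)

    # Initial similarity: 1 for identical nodes, 0 otherwise.
    s = [[1.0 if a == b else 0.0 for b in range(n)] for a in range(n)]

    def avg(sim, xs, ys):
        # Mean similarity over all neighbor pairs; 0 if either set is empty.
        if not xs or not ys:
            return 0.0
        return sum(sim[x][y] for x in xs for y in ys) / (len(xs) * len(ys))

    for _ in range(iters):
        nxt = [[0.0] * n for _ in range(n)]
        for a in range(n):
            nxt[a][a] = 1.0  # a node is always fully similar to itself
            for b in range(a + 1, n):
                val = (lam * c_in * avg(s, ins[a], ins[b])
                       + (1 - lam) * c_out * avg(s, outs[a], outs[b]))
                nxt[a][b] = nxt[b][a] = val  # similarity is symmetric
        s = nxt
    return s


# Nodes 0 and 1 both point to node 2, so they are similar via out-links.
sim = p_rank([(0, 2), (1, 2)], n=3)
print(round(sim[0][1], 3))  # 0.4
```

Each iteration visits every node pair and every pair of their neighbors, which is the cost that the low-rank and sampling-based approaches described in the abstract aim to avoid.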