On the efficiency of estimating penetrating rank on large graphs

Authors:
Weiren Yu; Jiajin Le; Xuemin Lin; Wenjie Zhang
Affiliations:
University of New South Wales & NICTA, Australia; Donghua University, China; University of New South Wales & NICTA, Australia; University of New South Wales & NICTA, Australia
Venue:
SSDBM'12 Proceedings of the 24th international conference on Scientific and Statistical Database Management
Year:
2012

Abstract

P-Rank (Penetrating Rank) has been suggested as a useful measure of structural similarity that takes into account both incoming and outgoing edges in ubiquitous networks. Existing work typically uses memoization to compute P-Rank similarity iteratively, which requires cubic time in the worst case. Moreover, previous methods focus mainly on deterministic computation of P-Rank and lack a probabilistic framework that scales well to large graphs. In this paper, we propose two efficient algorithms for computing P-Rank on large graphs. The first rests on the observation that a large body of objects in a real graph usually share similar neighborhood structures: by merging such objects with an explicit low-rank factorization, we devise a deterministic algorithm that computes P-Rank in quadratic time. The second rests on the observation that, by converting the iterative form of P-Rank into a matrix power series, we can use random sampling to compute P-Rank probabilistically in linear time with provable accuracy guarantees. Empirical results on both real and synthetic datasets show that our approaches achieve high time efficiency with controlled error and outperform the baseline algorithms by at least one order of magnitude.
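
For orientation, the iterative computation that the abstract treats as its starting point can be sketched as follows. This is only a naive illustration, not either of the two algorithms proposed in the paper, and it omits the memoization the abstract mentions. The abstract does not state the recurrence, so the sketch assumes the commonly cited P-Rank form: s(a, a) = 1, and for a ≠ b the score is a weighted sum of the average similarity of in-neighbor pairs and of out-neighbor pairs. The function name `p_rank`, the parameters `lam`, `c_in`, `c_out`, `iters`, and their default values are hypothetical choices made for this example.

```python
def p_rank(edges, n, lam=0.5, c_in=0.8, c_out=0.8, iters=10):
    """Naive iterative P-Rank on a directed graph with nodes 0..n-1.

    Assumed recurrence (not given in the abstract):
      s(a, a) = 1
      s(a, b) = lam * c_in * avg_{i in I(a), j in I(b)} s(i, j)
              + (1 - lam) * c_out * avg_{i in O(a), j in O(b)} s(i, j)
    where I(.) and O(.) are in- and out-neighbor sets, and an average
    over an empty set is taken to be 0.
    """
    in_nbrs = [[] for _ in range(n)]
    out_nbrs = [[] for _ in range(n)]
    for u, v in edges:
        out_nbrs[u].append(v)
        in_nbrs[v].append(u)

    # Initial scores: 1 on the diagonal, 0 elsewhere.
    s = [[1.0 if a == b else 0.0 for b in range(n)] for a in range(n)]

    def avg(nbrs_a, nbrs_b, prev):
        # Average previous-round similarity over all neighbor pairs.
        if not nbrs_a or not nbrs_b:
            return 0.0
        total = sum(prev[i][j] for i in nbrs_a for j in nbrs_b)
        return total / (len(nbrs_a) * len(nbrs_b))

    for _ in range(iters):
        nxt = [[0.0] * n for _ in range(n)]
        for a in range(n):
            nxt[a][a] = 1.0
            for b in range(a + 1, n):
                val = (lam * c_in * avg(in_nbrs[a], in_nbrs[b], s)
                       + (1 - lam) * c_out * avg(out_nbrs[a], out_nbrs[b], s))
                nxt[a][b] = nxt[b][a] = val  # scores are symmetric
        s = nxt
    return s


# Toy graph: nodes 1 and 2 share the same parent (0) and the same child (3).
sim = p_rank([(0, 1), (0, 2), (1, 3), (2, 3)], n=4)
```

On this toy graph, nodes 1 and 2 have identical in- and out-neighborhoods, so with the assumed defaults their score settles at about 0.8, while nodes 0 and 3 share neither kind of neighbor and score 0. Each round of this sketch visits every node pair and every neighbor pair, which is the kind of cost that motivates the faster deterministic and sampling-based approaches described in the abstract.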