Measuring similarity between objects is useful in many areas of computer science, including information retrieval. SimRank is a simple and intuitive similarity measure based on a graph-theoretic model. It is typically computed iteratively, in the spirit of PageRank. However, existing work on SimRank provides no accuracy estimate for the iterative computation and has discouraging time complexity. In this paper, we present a technique for estimating the accuracy of iterative SimRank computation, which makes it possible to determine the number of iterations required to achieve a desired accuracy. We also present optimization techniques that improve the computational complexity of the iterative algorithm from O(n^4) in the worst case to min(O(nl), O(n^3 / log_2 n)), where n denotes the number of objects and l denotes the number of object-to-object relationships. We also introduce a threshold-sieving heuristic, together with its accuracy estimate, that further improves the efficiency of the method. As a practical illustration of our techniques, we computed SimRank scores on a subset of the English Wikipedia corpus consisting of the complete set of articles and category links.
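For orientation, the unoptimized iterative computation can be sketched as follows. This is a minimal illustration of the naive iteration, not the optimized algorithm of the paper. The decay factor `c`, the function names, and the stopping rule are assumptions made for this sketch: it assumes that the error after k iterations is at most c^(k+1), a bound the abstract itself does not state.

```python
def iterations_for_accuracy(c, eps):
    """Smallest k with c**(k+1) <= eps.

    Assumption for this sketch: the error after k iterations is
    bounded by c**(k+1). The abstract does not give the bound.
    """
    k = 0
    while c ** (k + 1) > eps:
        k += 1
    return k


def simrank(in_neighbors, c=0.8, eps=1e-3):
    """Naive iterative SimRank over a dict: node -> list of in-neighbors.

    Each iteration visits every node pair and every pair of their
    in-neighbors, which is what makes the naive scheme O(n^4) in the
    worst case.
    """
    nodes = list(in_neighbors)
    k_max = iterations_for_accuracy(c, eps)
    # Iteration 0: a node is similar only to itself.
    scores = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}
    for _ in range(k_max):
        new_scores = {a: {} for a in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new_scores[a][b] = 1.0
                    continue
                ia, ib = in_neighbors[a], in_neighbors[b]
                if not ia or not ib:
                    # No in-neighbors on one side: similarity is zero.
                    new_scores[a][b] = 0.0
                    continue
                total = sum(scores[i][j] for i in ia for j in ib)
                new_scores[a][b] = c * total / (len(ia) * len(ib))
        scores = new_scores
    return scores, k_max


# Hypothetical toy graph: "u" links to both "a" and "b".
toy_graph = {"u": [], "a": ["u"], "b": ["u"]}
toy_scores, toy_iterations = simrank(toy_graph, c=0.8, eps=1e-3)
```

In this toy graph, `a` and `b` share their only in-neighbor, so their score settles at `c` after the first iteration; the remaining iterations are run only because the assumed bound is a worst-case guarantee.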