A fast two-stage algorithm for computing SimRank and its extensions

  • Authors:
  • Xu Jia;Hongyan Liu;Li Zou;Jun He;Xiaoyong Du

  • Affiliations:
  • Key Labs of Data Engineering and Knowledge Engineering, Ministry of Education, China and Department of Computer Science, Renmin University of China, China;Department of Management Science and Engineering, Tsinghua University, China;Key Labs of Data Engineering and Knowledge Engineering, Ministry of Education, China and Department of Computer Science, Renmin University of China, China;Key Labs of Data Engineering and Knowledge Engineering, Ministry of Education, China and Department of Computer Science, Renmin University of China, China;Key Labs of Data Engineering and Knowledge Engineering, Ministry of Education, China and Department of Computer Science, Renmin University of China, China

  • Venue:
  • WAIM'10 Proceedings of the 2010 international conference on Web-age information management
  • Year:
  • 2010

Abstract

Similarity estimation is used in many applications, such as recommender systems, cluster analysis, information retrieval, and link prediction. SimRank is a well-known algorithm that measures the similarity of objects based on link structure. We observe that if a node has no in-links, the similarity score between this node and any other node is always zero. Based on this observation, we propose a new algorithm, fast two-stage SimRank (F2S-SimRank), which avoids storing these unnecessary zeros and accelerates the computation. Because no accuracy is lost, the algorithm yields the same scores while using less computation time and less main memory. Experiments on real and synthetic datasets demonstrate the effectiveness and efficiency of F2S-SimRank.
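
The observation in the abstract can be illustrated with a minimal sketch. This is not the authors' F2S-SimRank, whose two stages the abstract does not describe. It is the plain iterative SimRank recurrence, restricted to nodes that have at least one in-link, so that the always-zero scores are never stored or computed. The function names, the decay factor c = 0.8, and the iteration count are assumptions chosen for illustration.

```python
def simrank_skip_sources(in_nbrs, c=0.8, iters=10):
    """Naive iterative SimRank that skips nodes without in-links.

    in_nbrs maps every node to the list of its in-neighbours.
    Returns a sparse dict {(a, b): score} for a != b, covering only
    nodes that have at least one in-link. Illustrative sketch only.
    """
    # Nodes with no in-links score zero against every other node,
    # so they are left out of the table entirely.
    active = [v for v, ins in in_nbrs.items() if ins]

    def lookup(table, a, b):
        if a == b:
            return 1.0                    # s(a, a) = 1 by definition
        return table.get((a, b), 0.0)     # pairs not stored are zero

    sim = {}
    for _ in range(iters):
        new = {}
        for a in active:
            for b in active:
                if a == b:
                    continue
                total = sum(lookup(sim, i, j)
                            for i in in_nbrs[a]
                            for j in in_nbrs[b])
                new[(a, b)] = c * total / (len(in_nbrs[a]) * len(in_nbrs[b]))
        sim = new
    return sim


def score(sim, a, b):
    """Read a score from the sparse table; missing pairs are zero."""
    return 1.0 if a == b else sim.get((a, b), 0.0)


# Hypothetical toy graph: u links to a and to b; u itself has no in-links.
graph = {"u": [], "a": ["u"], "b": ["u"]}
scores = simrank_skip_sources(graph)
```

In this toy graph, a and b share their single in-neighbour u, so their score equals the decay factor c, while every pair involving u is absent from the table and reads back as zero through `score`.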