Index Design for Dynamic Personalized PageRank

  • Authors:
  • Amit Pathak;Soumen Chakrabarti;Manish Gupta

  • Affiliations:
  • IIT Bombay;IIT Bombay. soumen@cse.iitb.ac.in;IIT Bombay

  • Venue:
  • ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Personalized page rank, related to random walks with restarts and conductance in resistive networks, is a frequent search paradigm for graph-structured databases. While efficient batch algorithms exist for static whole-graph page rank, interactive query-time personalized page rank has proved more challenging. Here we describe how to select and build indices for a popular class of page rank algorithms, so as to provide real-time personalized page rank and smoothly trade off between index size, preprocessing time, and query speed. We achieve this by developing a precise, yet efficiently estimated performance model for personalized page rank query execution. We use this model in conjunction with a query workload in a cost-benefit type index optimizer. On millions of queries from CiteSeer and its data graphs with 74-320 thousand nodes, our algorithm runs 50-400 x faster than whole-graph page rank, the gap growing with graph size. Index size is 10-20% of a text index. Ranking accuracy is above 94%.