Efficient PageRank approximation via graph aggregation

Authors:
Andrei Z. Broder;Ronny Lempel;Farzin Maghoul;Jan Pedersen
Affiliations:
IBM T.J. Watson Research Center, Hawthorne, NY;IBM Haifa Research Lab, Haifa, Israel;Yahoo! Inc., Sunnyvale, CA;Yahoo! Inc., Sunnyvale, CA
Venue:
Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters
Year:
2004

Abstract

We present a framework for approximating random-walk-based probability distributions over Web pages using graph aggregation. We (1) partition the Web graph into classes of quasi-equivalent vertices, (2) project the page-based random walk to be approximated onto those classes, and (3) compute the stationary probability distribution of the resulting class-based random walk. From this distribution we can quickly reconstruct a distribution on pages. In particular, our framework can approximate the well-known PageRank distribution by setting the classes according to the set of pages on each Web host. We experimented on a Web graph containing over 1.4 billion pages and were able to produce a ranking that has a Spearman rank-order correlation of 0.95 with respect to PageRank. A simplistic implementation of our method required less than half the running time of a highly optimized implementation of PageRank, implying that larger speedup factors are probably possible.
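The three steps and the reconstruction can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the projection weights, the teleportation model, or the reconstruction rule, so the choices below (inter-class link counts as transition weights, uniform teleportation over classes, and splitting each class's mass among its pages in proportion to in-degree plus one) are assumptions made for illustration. The function name `aggregated_rank` and its parameters are hypothetical.

```python
from collections import defaultdict


def aggregated_rank(edges, page_class, damping=0.85, iters=100):
    """Approximate a page-level ranking through a class-level random walk.

    edges:      list of (source_page, target_page) links
    page_class: dict mapping every page to its class (e.g. its Web host)
    Returns (class_rank, page_rank), two dicts that each sum to 1.
    """
    # Step 1 (partition) is supplied by the caller through page_class.
    classes = sorted(set(page_class.values()))
    n = len(classes)

    # Step 2: project the page walk onto classes.
    # ASSUMPTION: the class-to-class weight is the number of page links.
    weight = defaultdict(lambda: defaultdict(float))
    for src, dst in edges:
        weight[page_class[src]][page_class[dst]] += 1.0

    # Step 3: stationary distribution of the class walk by power iteration.
    # ASSUMPTION: uniform teleportation; dangling classes jump uniformly.
    rank = {c: 1.0 / n for c in classes}
    for _ in range(iters):
        nxt = {c: (1.0 - damping) / n for c in classes}
        for c in classes:
            out = weight.get(c)
            total = sum(out.values()) if out else 0.0
            if total == 0.0:
                for d in classes:
                    nxt[d] += damping * rank[c] / n
            else:
                for d, w in out.items():
                    nxt[d] += damping * rank[c] * w / total
        rank = nxt

    # Reconstruction: spread each class's mass over its member pages.
    # ASSUMPTION: shares are proportional to (in-degree + 1).
    indegree = defaultdict(int)
    for _, dst in edges:
        indegree[dst] += 1
    members = defaultdict(list)
    for page, c in page_class.items():
        members[c].append(page)
    page_rank = {}
    for c, pages in members.items():
        norm = sum(indegree[p] + 1 for p in pages)
        for p in pages:
            page_rank[p] = rank[c] * (indegree[p] + 1) / norm
    return rank, page_rank


if __name__ == "__main__":
    # Toy graph with host-based classes, as in the PageRank use case.
    page_class = {"a.com/1": "a.com", "a.com/2": "a.com",
                  "b.com/1": "b.com", "c.com/1": "c.com"}
    edges = [("a.com/1", "b.com/1"), ("a.com/2", "b.com/1"),
             ("b.com/1", "a.com/1"), ("c.com/1", "b.com/1")]
    host_rank, page_rank = aggregated_rank(edges, page_class)
    for page in sorted(page_rank, key=page_rank.get, reverse=True):
        print(page, round(page_rank[page], 4))
```

The iteration runs over classes rather than pages, so its cost scales with the number of classes and inter-class links. This is the intuition behind the speedup the abstract reports, although the sketch makes no claim about matching the reported running time or correlation.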