Efficient PageRank approximation via graph aggregation

Authors:
A. Z. Broder; R. Lempel; F. Maghoul; J. Pedersen
Affiliations:
IBM T.J. Watson Research Center, USA; IBM Research Lab, Haifa, Israel; Yahoo! Inc., USA; Yahoo! Inc., USA
Venue:
Information Retrieval
Year:
2006

Abstract

We present a framework for approximating random-walk-based probability distributions over Web pages using graph aggregation. The basic idea is to partition the graph into classes of quasi-equivalent vertices, to project the page-based random walk to be approximated onto those classes, and to compute the stationary probability distribution of the resulting class-based random walk. From this distribution we can quickly reconstruct a distribution on pages. In particular, our framework can approximate the well-known PageRank distribution by setting the classes according to the set of pages on each Web host.

We experimented on a Web graph containing over 1.4 billion pages and over 6.6 billion links, taken from a crawl of the Web conducted by AltaVista in September 2003. We were able to produce a ranking that has a Spearman rank-order correlation of 0.95 with respect to PageRank. The wall-clock time required by a simplistic implementation of our method was less than half the time required by a highly optimized implementation of PageRank, implying that larger speedup factors are probably possible.
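The aggregation idea described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation. The function names (`pagerank`, `aggregated_rank`, `spearman`), the toy graph, the link-count weighting of host-to-host edges, and the rule that splits a host's score among its pages in proportion to `1 + indegree` are all assumptions made for illustration; the abstract does not specify these details.

```python
from collections import defaultdict


def pagerank(out_links, damping=0.85, iters=100):
    """Power iteration on a weighted directed graph.

    out_links maps node -> {successor: weight}; every node must be a key.
    Nodes without out-links spread their mass uniformly over all nodes.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        nxt = {v: base for v in nodes}
        for v in nodes:
            total = sum(out_links[v].values())
            for w, weight in out_links[v].items():
                nxt[w] += damping * rank[v] * weight / total
        rank = nxt
    return rank


def aggregated_rank(page_links, host_of, damping=0.85, iters=100):
    """Approximate page scores through a host-level random walk.

    page_links maps page -> list of successor pages; host_of maps page -> host.
    Every page must be a key of both dictionaries.
    """
    # Projection step (assumed weighting): a host-to-host edge weighs as much
    # as the number of page links between the two hosts, self-loops included.
    host_links = {h: defaultdict(float) for h in set(host_of.values())}
    indegree = defaultdict(int)
    for page, succs in page_links.items():
        for succ in succs:
            host_links[host_of[page]][host_of[succ]] += 1.0
            indegree[succ] += 1

    # Stationary distribution of the class-based (host-level) walk.
    host_rank = pagerank(host_links, damping, iters)

    # Reconstruction step (assumed rule): split each host's mass among its
    # pages in proportion to 1 + indegree.
    host_total = defaultdict(float)
    for page in page_links:
        host_total[host_of[page]] += 1 + indegree[page]
    return {
        page: host_rank[host_of[page]] * (1 + indegree[page]) / host_total[host_of[page]]
        for page in page_links
    }


def spearman(x, y):
    """Spearman rank-order correlation of two score dicts over the same keys.

    Uses the classic formula, which is exact only when there are no ties;
    ties are broken by key order here.
    """
    keys = list(x)
    n = len(keys)

    def ranks(scores):
        order = sorted(keys, key=lambda k: -scores[k])
        return {k: i for i, k in enumerate(order)}

    rx, ry = ranks(x), ranks(y)
    d2 = sum((rx[k] - ry[k]) ** 2 for k in keys)
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


# Hypothetical four-page, two-host graph.
page_links = {
    "a.com/1": ["a.com/2", "b.com/1"],
    "a.com/2": ["a.com/1"],
    "b.com/1": ["a.com/1", "b.com/2"],
    "b.com/2": ["b.com/1"],
}
host_of = {page: page.split("/")[0] for page in page_links}

exact = pagerank({p: {q: 1.0 for q in succs} for p, succs in page_links.items()})
approx = aggregated_rank(page_links, host_of)
print(spearman(exact, approx))
```

The host-level walk runs over far fewer nodes than the page-level walk, which is where the speedup reported in the abstract would come from. The quality of the approximation depends on the projection and reconstruction rules, both of which are simplified here.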