Speeding up algorithms on compressed web graphs

  • Authors:
  • Chinmay Karande; Kumar Chellapilla; Reid Andersen

  • Affiliations:
  • Georgia Inst. Technology, Atlanta, GA; Microsoft Live Labs, Bellevue, WA; Microsoft Live Labs, Bellevue, WA

  • Venue:
  • Proceedings of the Second ACM International Conference on Web Search and Data Mining
  • Year:
  • 2009

Abstract

A variety of lossless compression schemes have been proposed to reduce the storage requirements of web graphs. One successful approach is virtual node compression [7], in which often-used patterns of links are replaced by links to virtual nodes, creating a compressed graph that succinctly represents the original. In this paper, we show that several important classes of web graph algorithms can be extended to run directly on virtual node compressed graphs, such that their running times depend on the size of the compressed graph rather than the original. These include algorithms for link analysis, estimating the size of vertex neighborhoods, and a variety of algorithms based on matrix-vector products and random walks. Similar speed-ups have been obtained previously for classical graph algorithms like shortest paths and maximum bipartite matching. We measure the performance of our modified algorithms on several publicly available web graph datasets, and demonstrate significant empirical speedups that nearly match the compression ratios.
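As an illustration of why running time can track the compressed size, the following is a minimal sketch of an adjacency matrix-vector product evaluated on a virtual node compressed graph. It is not taken from the paper: the function name `compressed_matvec`, the dictionary-based graph layout, and the toy graph are all hypothetical. It assumes that virtual nodes do not form cycles and that every original link corresponds to exactly one path through virtual nodes, so each compressed edge is touched once.

```python
def compressed_matvec(adj, virtual, x):
    """Return y with y[u] = sum of x[w] over original links u -> w.

    adj:     compressed adjacency lists (hypothetical layout), node -> list of nodes
    virtual: set of virtual node names (assumed acyclic among themselves)
    x:       dict mapping every real node to a number
    """
    partial = {}  # virtual node -> sum of x over the real nodes it stands for

    def reach_sum(v):
        # Each virtual node's sum is computed once and then reused by
        # every real node that links to it.
        if v not in partial:
            partial[v] = sum(
                reach_sum(w) if w in virtual else x[w] for w in adj.get(v, [])
            )
        return partial[v]

    return {
        u: sum(reach_sum(w) if w in virtual else x[w] for w in adj.get(u, []))
        for u in x
    }


# Toy example: a, b, c each link to d, e, f (9 links), plus a -> b.
# The 3x3 pattern is replaced by one virtual node "V": 10 links become 7.
adj = {
    "a": ["V", "b"],
    "b": ["V"],
    "c": ["V"],
    "V": ["d", "e", "f"],
}
x = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "f": 6}
y = compressed_matvec(adj, {"V"}, x)
print(y)  # y["a"] is 17, y["b"] and y["c"] are 15, the rest are 0
```

In this sketch the sum over d, e, and f is computed once at the virtual node and shared by a, b, and c, which is the kind of reuse that lets the cost scale with the number of compressed edges rather than original links.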