Clique partitions, graph compression and speeding-up algorithms
Journal of Computer and System Sciences
The anatomy of a large-scale hypertextual Web search engine
WWW7 Proceedings of the seventh international conference on World Wide Web 7
Trawling the Web for emerging cyber-communities
WWW '99 Proceedings of the eighth international conference on World Wide Web
Authoritative sources in a hyperlinked environment
Journal of the ACM (JACM)
The stochastic approach for link-structure analysis (SALSA) and the TKC effect
Proceedings of the 9th international World Wide Web conference on Computer networks : the international journal of computer and telecommunications networking
Representing Graph Metrics with Fewest Edges
STACS '03 Proceedings of the 20th Annual Symposium on Theoretical Aspects of Computer Science
Scaling personalized web search
WWW '03 Proceedings of the 12th international conference on World Wide Web
Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search
IEEE Transactions on Knowledge and Data Engineering
The WebGraph Framework II: Codes For The World-Wide Web
DCC '04 Proceedings of the Conference on Data Compression
The webgraph framework I: compression techniques
Proceedings of the 13th international conference on World Wide Web
UbiCrawler: a scalable fully distributed web crawler
Software—Practice & Experience
A uniform approach to accelerated PageRank computation
WWW '05 Proceedings of the 14th international conference on World Wide Web
Proceedings of the 15th international conference on World Wide Web
Transductive link spam detection
AIRWeb '07 Proceedings of the 3rd international workshop on Adversarial information retrieval on the web
A decentralized algorithm for spectral analysis
Journal of Computer and System Sciences
A scalable pattern mining approach to web graph compression with communities
WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
An Inner-Outer Iteration for Computing PageRank
SIAM Journal on Scientific Computing
SWORD: scalable workload-aware data placement for transactional workloads
Proceedings of the 16th International Conference on Extending Database Technology
A variety of lossless compression schemes have been proposed to reduce the storage requirements of web graphs. One successful approach is virtual node compression [7], in which often-used patterns of links are replaced by links to virtual nodes, creating a compressed graph that succinctly represents the original. In this paper, we show that several important classes of web graph algorithms can be extended to run directly on virtual node compressed graphs, such that their running times depend on the size of the compressed graph rather than the original. These include algorithms for link analysis, estimating the size of vertex neighborhoods, and a variety of algorithms based on matrix-vector products and random walks. Similar speed-ups have been obtained previously for classical graph algorithms like shortest paths and maximum bipartite matching. We measure the performance of our modified algorithms on several publicly available web graph datasets, and demonstrate significant empirical speedups that nearly match the compression ratios.
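To make the idea of running a matrix-vector product directly on a virtual node compressed graph concrete, the following is a minimal sketch and not the paper's implementation. It assumes a single layer of virtual nodes (no virtual-to-virtual links) and that every original link corresponds to exactly one path in the compressed graph. The function names, the node numbering (real nodes `0..n-1`, virtual nodes `n..n+k-1`), and the toy graph are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: one "push" step y[v] = sum of x[u] over links u -> v,
# computed on the original graph and on a virtual node compressed graph.
# Assumptions (not from the source): real nodes are 0..n-1, virtual nodes are
# n..n+k-1, and virtual nodes never link to other virtual nodes.

def push_original(edges, n, x):
    """Push x along every link of the uncompressed graph."""
    y = [0] * n
    for u, v in edges:
        y[v] += x[u]
    return y


def push_compressed(cedges, n, k, x):
    """Same result as push_original, but touching only compressed edges."""
    # Phase 1: each virtual node accumulates the values of the real nodes
    # that link to it.
    acc = [0] * k
    for u, v in cedges:
        if u < n and v >= n:
            acc[v - n] += x[u]
    # Phase 2: real targets receive values from real sources directly and
    # from virtual sources via the accumulated sums.
    y = [0] * n
    for u, v in cedges:
        if v < n:
            y[v] += x[u] if u < n else acc[u - n]
    return y


# Toy graph: nodes 0, 1, 2 each link to nodes 3, 4, 5 (a 3 x 3 biclique),
# plus one extra link 5 -> 0.
N = 6
ORIGINAL_EDGES = [(a, b) for a in (0, 1, 2) for b in (3, 4, 5)] + [(5, 0)]

# Compressed form: the biclique is replaced by virtual node 6.
K = 1
COMPRESSED_EDGES = [(0, 6), (1, 6), (2, 6), (6, 3), (6, 4), (6, 5), (5, 0)]

X = [1, 2, 3, 4, 5, 6]

if __name__ == "__main__":
    print(push_original(ORIGINAL_EDGES, N, X))
    print(push_compressed(COMPRESSED_EDGES, N, K, X))
```

In this toy case the compressed graph has 7 edges instead of 10, and the compressed push does work proportional to the 7 compressed edges while producing the same vector. Deeper hierarchies of virtual nodes would need the accumulation phase to follow an ordering of the virtual nodes, which this sketch does not handle.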