A large time-aware web graph

Authors:
Paolo Boldi;Massimo Santini;Sebastiano Vigna
Affiliations:
Università di Milano, Italy;Università di Milano, Italy;Università di Milano, Italy
Venue:
ACM SIGIR Forum
Year:
2008

Abstract

We describe the techniques developed to gather and distribute, in a highly compressed yet accessible form, a series of twelve snapshots of the .uk web domain. Ad hoc compression techniques made it possible to store the twelve snapshots using just 1.9 bits per link, with constant-time access to temporal information. Our collection makes it possible to study the temporal evolution of link-based scores (e.g., PageRank), the growth of online communities, and, in general, time-dependent phenomena related to the link structure.
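
As a rough illustration of what constant-time access to temporal information could look like, the sketch below labels each link with a 12-bit mask, one bit per snapshot. This layout is an assumption made for illustration only and does not describe the compressed format used for the collection; the names `TimeAwareGraph`, `add_link`, `present`, and `snapshot` are hypothetical.

```python
from collections import defaultdict

NUM_SNAPSHOTS = 12  # the collection comprises twelve snapshots


class TimeAwareGraph:
    """Toy time-aware graph: each link carries a bitmask of snapshots.

    Hypothetical in-memory layout, not the collection's compressed format.
    """

    def __init__(self):
        # source -> {target -> bitmask}; bit t is set if the link exists in snapshot t
        self.links = defaultdict(dict)

    def add_link(self, src, dst, t):
        """Record that the link src -> dst is present in snapshot t."""
        if not 0 <= t < NUM_SNAPSHOTS:
            raise ValueError("snapshot index out of range")
        self.links[src][dst] = self.links[src].get(dst, 0) | (1 << t)

    def present(self, src, dst, t):
        """Test one bit of the label: a single shift and mask per query."""
        mask = self.links.get(src, {}).get(dst, 0)
        return bool((mask >> t) & 1)

    def snapshot(self, t):
        """Yield the links that exist in snapshot t."""
        for src, targets in self.links.items():
            for dst, mask in targets.items():
                if (mask >> t) & 1:
                    yield (src, dst)


# Small usage example with made-up node names.
g = TimeAwareGraph()
g.add_link("a", "b", 0)
g.add_link("a", "b", 3)
g.add_link("b", "c", 3)
```

With such a representation, a per-snapshot link-based score such as PageRank could be computed by iterating over `g.snapshot(t)` for each `t` and comparing the results across snapshots.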