Size-estimation framework with applications to transitive closure and reachability
Journal of Computer and System Sciences
The space complexity of approximating the frequency moments
Journal of Computer and System Sciences
ANF: a fast and scalable tool for data mining in massive graphs
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
The webgraph framework I: compression techniques
Proceedings of the 13th international conference on World Wide Web
Broadword implementation of rank/select queries
WEA'08 Proceedings of the 7th international conference on Experimental algorithms
Pregel: a system for large-scale graph processing
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
HADI: Mining Radii of Large Graphs
ACM Transactions on Knowledge Discovery from Data (TKDD)
Robustness of social networks: comparative results based on distance distributions
SocInfo'11 Proceedings of the Third international conference on Social informatics
Injecting uncertainty in graphs for identity obfuscation
Proceedings of the VLDB Endowment
On computing the diameter of real-world directed (weighted) graphs
SEA'12 Proceedings of the 11th international conference on Experimental Algorithms
Proceedings of the 3rd Annual ACM Web Science Conference
Impact neighborhood indexing (INI) in diffusion graphs
Proceedings of the 21st ACM international conference on Information and knowledge management
Evolution of social-attribute networks: measurements, modeling, and implications using google+
Proceedings of the 2012 ACM conference on Internet measurement conference
Four Degrees of Separation, Really
ASONAM '12 Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012)
Using Pregel-like Large Scale Graph Processing Frameworks for Social Network Analysis
ASONAM '12 Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012)
Competition-based networks for expert finding
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
How social network is evolving?: a preliminary study on billion-scale twitter network
Proceedings of the 22nd international conference on World Wide Web companion
Scalable similarity estimation in social networks: closeness, node labels, and random edge lengths
Proceedings of the first ACM conference on Online social networks
Call me maybe: understanding nature and risks of sharing mobile numbers on online social networks
Proceedings of the first ACM conference on Online social networks
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
ACM SIGOPS 24th Symposium on Operating Systems Principles
X-Stream: edge-centric graph processing using streaming partitions
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
The neighbourhood function N_G(t) of a graph G gives, for each t ∈ N, the number of pairs of nodes x, y such that y is reachable from x in less than t hops. The neighbourhood function provides a wealth of information about the graph [10] (e.g., it easily allows one to compute its diameter), but computing it exactly is very expensive. Recently, the ANF algorithm [10] (approximate neighbourhood function) has been proposed with the purpose of approximating N_G(t) on large graphs. We describe a breakthrough improvement over ANF in terms of speed and scalability. Our algorithm, called HyperANF, uses the new HyperLogLog counters [5] and combines them efficiently through broadword programming [8]; our implementation uses task decomposition to exploit multi-core parallelism. With HyperANF, for the first time we can compute in a few hours the neighbourhood function of graphs with billions of nodes with a small error and good confidence using a standard workstation. Then, we turn to the study of the distribution of the distances between reachable nodes (which can be efficiently approximated by means of HyperANF), and discover the surprising fact that its index of dispersion provides a clear-cut characterisation of proper social networks vs. web graphs. We thus propose the spid (Shortest-Paths Index of Dispersion) of a graph as a new, informative statistic that is able to discriminate between the above two types of graphs. We believe this is the first proposal of a significant new non-local structural index for complex networks whose computation is highly scalable.
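
To make the two quantities in the abstract concrete, the following is a minimal Python sketch that computes the neighbourhood function and the spid exactly on a tiny graph, using one breadth-first search per node. It is not HyperANF, which replaces this exact computation with HyperLogLog counters precisely because the exact approach does not scale. The function names, the adjacency-dictionary input format, and the choice to leave pairs with x = y out of the distance distribution are assumptions made for illustration; the index of dispersion is taken here as the variance-to-mean ratio.

```python
from collections import deque


def distance_histogram(adj):
    """Return {d: number of ordered pairs (x, y) with shortest-path distance d}.

    `adj` maps each node to an iterable of its successors (hypothetical format).
    Runs one BFS per node, so it is only practical for small graphs.
    """
    hist = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for succ in adj.get(node, ()):
                if succ not in dist:
                    dist[succ] = dist[node] + 1
                    queue.append(succ)
        for d in dist.values():
            hist[d] = hist.get(d, 0) + 1
    return hist


def neighbourhood_function(hist, t):
    """Number of pairs (x, y) with y reachable from x in less than t hops.

    Some presentations count "at most t hops" instead, which shifts t by one.
    """
    return sum(count for d, count in hist.items() if d < t)


def spid(hist):
    """Variance-to-mean ratio of the distances between reachable pairs.

    Assumption: pairs with x == y (distance 0) are excluded.
    """
    pairs = {d: c for d, c in hist.items() if d > 0}
    total = sum(pairs.values())
    mean = sum(d * c for d, c in pairs.items()) / total
    variance = sum(c * (d - mean) ** 2 for d, c in pairs.items()) / total
    return variance / mean


# Undirected path 0 - 1 - 2 - 3, stored as a symmetric directed graph.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
hist = distance_histogram(path)
print([neighbourhood_function(hist, t) for t in range(1, 5)])
print(spid(hist))
```

The diameter can be read off the same histogram as the largest distance with a non-zero count, which is the sense in which the neighbourhood function "easily allows one to compute" it.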