Estimating and sampling graphs with multidimensional random walks

Authors:
Bruno Ribeiro; Don Towsley
Affiliations:
University of Massachusetts Amherst, Amherst, MA, USA; University of Massachusetts Amherst, Amherst, MA, USA
Venue:
IMC '10 Proceedings of the 10th ACM SIGCOMM conference on Internet measurement
Year:
2010

Abstract

Estimating characteristics of large graphs via sampling is a vital part of the study of complex networks. Current sampling methods, such as independent random vertex sampling and random walks, are useful but have drawbacks. Random vertex sampling may require too many resources (time, bandwidth, or money). Random walks, which normally require fewer resources per sample, can suffer from large estimation errors when the graph is disconnected or only loosely connected. In this work we propose a new m-dimensional random walk that uses m dependent random walkers. We show that the proposed sampling method, which we call Frontier sampling, retains the desirable sampling properties of a regular random walk. At the same time, our simulations over large real-world graphs show that, in the presence of disconnected or loosely connected components, Frontier sampling exhibits lower estimation errors than regular random walks. We also show that Frontier sampling is better suited than random vertex sampling to sampling the tail of the degree distribution of the graph.
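The abstract does not spell out how the m walkers depend on one another, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that walkers start at vertices chosen uniformly at random, that at each step one walker is selected with probability proportional to the degree of its current vertex, and that the selected walker then moves along a uniformly chosen incident edge. The function names `frontier_sampling` and `estimate_degree_distribution`, the adjacency-dict input format, and the 1/degree reweighting used to estimate the degree distribution are illustrative choices introduced here.

```python
import random


def frontier_sampling(adj, m, budget, rng=None):
    """Sketch of an m-walker dependent random walk (assumed rules, see lead-in).

    adj    -- dict mapping each vertex to a list of its neighbours (undirected graph)
    m      -- number of walkers
    budget -- number of edges to sample
    Returns the list of sampled edges (u, v).
    """
    rng = rng or random.Random()
    # Assumption: initial positions are drawn uniformly at random, with
    # replacement, among vertices that have at least one neighbour.
    candidates = [v for v in adj if adj[v]]
    walkers = [rng.choice(candidates) for _ in range(m)]
    edges = []
    for _ in range(budget):
        # Assumption: choose which walker moves with probability
        # proportional to the degree of the vertex it currently occupies.
        weights = [len(adj[u]) for u in walkers]
        i = rng.choices(range(m), weights=weights, k=1)[0]
        u = walkers[i]
        # The chosen walker follows one of its incident edges uniformly.
        v = rng.choice(adj[u])
        edges.append((u, v))
        walkers[i] = v
    return edges


def estimate_degree_distribution(adj, edges):
    """Estimate the fraction of vertices with each degree from sampled edges.

    Assumption: edge endpoints are visited roughly in proportion to their
    degree, so each observation is reweighted by 1/degree.
    """
    mass = {}
    total = 0.0
    for _, v in edges:
        d = len(adj[v])
        mass[d] = mass.get(d, 0.0) + 1.0 / d
        total += 1.0 / d
    return {d: w / total for d, w in mass.items()}
```

A call such as `frontier_sampling(adj, m=3, budget=50)` on a graph made of two disconnected triangles can return edges from both components when the walkers start in different ones, which a single random walker cannot do. This illustrates, in a toy setting, why multiple walkers may help on disconnected or loosely connected graphs.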