Given a huge real graph, how can we derive a representative sample? There are many known algorithms to compute interesting measures (shortest paths, centrality, betweenness, etc.), but several of them become impractical for large graphs. Thus graph sampling is essential.

The natural questions to ask are (a) which sampling method to use, (b) how small the sample size can be, and (c) how to scale up the measurements of the sample (e.g., the diameter) to get estimates for the large graph. The deeper, underlying question is subtle: how do we measure success?

We answer the above questions, and test our answers by thorough experiments on several diverse datasets, spanning thousands of nodes and edges. We consider several sampling methods, propose novel methods to check the goodness of sampling, and develop a set of scaling laws that describe relations between the properties of the original and the sample.

In addition to the theoretical contributions, the practical conclusions from our work are: sampling strategies based on edge selection do not perform well; simple uniform random node selection performs surprisingly well. Overall, the best-performing methods are the ones based on random walks and "forest fire"; they match very accurately both static and evolutionary graph patterns, with sample sizes down to about 15% of the original graph.
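Two of the sampling families named above, uniform random node selection and random-walk-based sampling, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the procedure evaluated in the paper: the adjacency-dict representation, the function names, the restart probability of 0.15, and the step cap are all hypothetical choices made for the example.

```python
import random


def random_node_sample(adj, fraction, rng):
    """Keep a uniformly random subset of nodes plus the edges among them.

    adj is an undirected graph as {node: set_of_neighbors} (assumed format).
    """
    k = max(1, round(fraction * len(adj)))
    kept = set(rng.sample(sorted(adj), k))
    # Induced subgraph: retain only edges whose endpoints were both kept.
    return {u: {v for v in adj[u] if v in kept} for u in kept}


def random_walk_sample(adj, fraction, rng, restart_prob=0.15, max_steps=None):
    """Collect the nodes visited by a random walk that restarts at its origin.

    restart_prob and max_steps are illustrative assumptions. The step cap
    stops the loop if the walk cannot reach enough distinct nodes, for
    example when the start node lies in a small component.
    """
    target = max(1, round(fraction * len(adj)))
    if max_steps is None:
        max_steps = 100 * len(adj)
    start = rng.choice(sorted(adj))
    current = start
    visited = {start}
    steps = 0
    while len(visited) < target and steps < max_steps:
        steps += 1
        if rng.random() < restart_prob or not adj[current]:
            current = start  # jump back to the starting node
        else:
            current = rng.choice(sorted(adj[current]))
        visited.add(current)
    # Induced subgraph on the visited nodes.
    return {u: {v for v in adj[u] if v in visited} for u in visited}
```

Called with `fraction=0.15`, either function returns a subgraph on at most roughly 15% of the nodes, which is the smallest sample size the abstract reports as still matching the original graph's patterns. Judging whether a given sample is actually representative requires comparing graph properties between sample and original, which this sketch does not attempt.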