Emerging research on online social networks (OSNs) requires large amounts of data. However, OSN sites typically restrict data crawling, for example by rate-limiting requests on a per-IP basis. This makes it challenging for an individual research group to collect sufficient data using only its own network resources. In this paper, we introduce and motivate crowd crawling, which allows multiple research groups to crawl data efficiently and collaboratively. Crowd crawling is designed to address several practical challenges, including the resource diversity of different partners, strict request rate limiting by OSN providers, and data fidelity. We implemented and deployed a crowd crawling prototype on PlanetLab and demonstrated its performance through evaluations. We have made the datasets crawled in our evaluation publicly available.