Parallel crawling for online social networks

Authors:
Duen Horng Chau;Shashank Pandit;Samuel Wang;Christos Faloutsos
Affiliations:
Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA
Venue:
Proceedings of the 16th international conference on World Wide Web
Year:
2007

Abstract

Given a huge online social network, how do we retrieve information from it through crawling? Even better, how do we improve crawling performance by using parallel crawlers that work independently? In this paper, we present a framework of parallel crawlers for online social networks that utilizes a centralized queue. To show how this works in practice, we describe our implementation of the crawlers for an online auction website. The crawlers work independently, so the failure of one crawler does not affect the others. The framework ensures that no redundant crawling occurs. Using the crawlers that we built, we visited approximately 11 million auction users, about 66,000 of whom were completely crawled.
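
The idea of independent crawlers coordinated through a centralized queue can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no code, so the class `CentralQueue`, the function `crawl`, the in-memory `graph` standing in for the website, and the use of threads in place of separate crawler processes are all assumptions made for illustration. The sketch shows the two properties the abstract names: each user is handed out at most once (no redundant crawling), and a crawler that fails returns its user to the queue so the remaining crawlers carry on.

```python
import queue
import threading


class CentralQueue:
    """Hypothetical centralized queue: enqueues each user ID at most once."""

    def __init__(self, seeds):
        self._lock = threading.Lock()
        self._seen = set()             # every user ID ever enqueued
        self._pending = queue.Queue()  # user IDs waiting to be crawled
        self.add(seeds)

    def add(self, user_ids):
        # Only IDs never seen before are enqueued, so no user is crawled twice.
        with self._lock:
            for uid in user_ids:
                if uid not in self._seen:
                    self._seen.add(uid)
                    self._pending.put(uid)

    def take(self):
        return self._pending.get()     # blocks until a user ID is available

    def requeue(self, uid):
        self._pending.put(uid)         # hand back a user a failed crawler held

    def done(self):
        self._pending.task_done()

    def wait(self):
        self._pending.join()           # returns once every queued ID is handled


def crawl(graph, seeds, n_workers=3, failing_worker=None):
    """Crawl a toy graph (dict: user -> neighbors) with independent workers.

    `failing_worker` simulates one crawler crashing on its first user.
    Assumes at least one worker stays alive to finish the crawl.
    """
    central = CentralQueue(seeds)
    crawled = []
    crawled_lock = threading.Lock()

    def worker(worker_id):
        while True:
            uid = central.take()
            try:
                if worker_id == failing_worker:
                    raise RuntimeError("simulated crawler failure")
                central.add(graph[uid])    # report newly discovered users
                with crawled_lock:
                    crawled.append(uid)
            except RuntimeError:
                central.requeue(uid)       # another crawler will pick it up
                return                     # this crawler stops; others continue
            finally:
                central.done()

    for i in range(n_workers):
        threading.Thread(target=worker, args=(i,), daemon=True).start()
    central.wait()
    return crawled


if __name__ == "__main__":
    toy = {"a": ["b", "c"], "b": ["a", "d"], "c": ["d"], "d": ["a"], "e": []}
    print(sorted(crawl(toy, ["a"], n_workers=3, failing_worker=0)))
```

Keeping the set of seen users inside the queue, rather than in each crawler, is what lets the crawlers stay stateless in this sketch: a crawler only needs to take a user, fetch that user's neighbors, and report them back.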