Limiting large-scale crawls of social networking sites

  • Authors:
  • Mainack Mondal;Bimal Viswanath;Allen Clement;Peter Druschel;Krishna P. Gummadi;Alan Mislove;Ansley Post

  • Affiliations:
  • Max Planck Institute for Software Systems (MPI-SWS), Saarbruecken/Kaiserslautern, Germany;Max Planck Institute for Software Systems (MPI-SWS), Saarbruecken/Kaiserslautern, Germany;Max Planck Institute for Software Systems (MPI-SWS), Saarbruecken/Kaiserslautern, Germany;Max Planck Institute for Software Systems (MPI-SWS), Saarbruecken/Kaiserslautern, Germany;Max Planck Institute for Software Systems (MPI-SWS), Saarbruecken/Kaiserslautern, Germany;Northeastern University, Boston, MA, USA;Max Planck Institute for Software Systems (MPI-SWS), Saarbruecken/Kaiserslautern, Germany

  • Venue:
  • Proceedings of the ACM SIGCOMM 2011 conference
  • Year:
  • 2011

Abstract

Online social networking sites (OSNs) like Facebook and Orkut contain personal data of millions of users. Many OSNs view this data as a valuable asset that is at the core of their business model. Both OSN users and OSNs have strong incentives to restrict large-scale crawls of this data: users want to protect their privacy, and OSNs want to protect their business interests. Traditional defenses against crawlers involve rate-limiting browsing activity per user account. These defense schemes, however, are vulnerable to Sybil attacks, where a crawler creates a large number of fake user accounts. In this paper, we propose Genie, a system that can be deployed by OSN operators to defend against Sybil crawlers. Genie is based on a simple yet powerful insight: the social network itself can be leveraged to defend against Sybil crawlers. We first present Genie's design and then discuss how Genie can limit crawlers while allowing browsing of user profiles by normal users.