Partitioning social networks for time-dependent queries

  • Authors:
  • Berenice Carrasco;Yi Lu;Joana M. F. da Trindade

  • Affiliations:
  • University of Illinois;University of Illinois;University of Illinois

  • Venue:
  • Proceedings of the 4th Workshop on Social Network Systems
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

The most common type of queries in online social networks are news feeds of friends' recent activities. These queries involve the retrieval of multiple small records generated by different users in the network, and the results are time dependent. Hash-based horizontal partitioning of data results in accesses at multiple servers, which significantly affects throughput and response time. Partitioning of social network data is difficult because of the power-law degree distribution of the friendship graph, and the time dependency of queries and user activities. The power-law degree distribution results in a tremendous amount of extra storage for replication-based partitions, and the time dependency makes query-driven partitioning ineffective. We propose to partition not only the spatial network of social relations, but also in the time dimension so that users who have communicated in a given period are grouped together. We build an activity prediction graph to capture relationships with strong activity and serve as the basis for partitioning. New nodes occurring in the current period are added greedily. We test the partitioning results with emulation of Facebook page downloads, and show that our algorithm achieves 10 times better data locality than hash-based horizontal partitioning algorithms. We show the quality of activity prediction by observing that the algorithm with prediction achieves 80% data locality of that with perfect knowledge of the current period.