Latency-aware data partitioning for geo-replicated online social networks

  • Authors:
  • Lei Jiao;Tianyin Xu;Jun Li;Xiaoming Fu

  • Affiliations:
  • University of Goettingen;U.C. San Diego;University of Oregon;University of Goettingen

  • Venue:
  • Proceedings of the Workshop on Posters and Demos Track
  • Year:
  • 2011

Abstract

Large-scale online social networks (OSNs) usually replicate data across multiple datacenters in multiple geographic locations to ensure high availability and performance [1]. The de facto replication method in current OSNs (e.g., Facebook) is full replication, in which each geo-distributed datacenter maintains a complete copy of all the data. Full replication achieves good performance in a simple way but incurs high maintenance overhead (e.g., replica storage and synchronization). First, total storage grows linearly with the number of datacenters deployed, so the approach scales poorly. Second, the replicas at all locations must be kept synchronized, which generates a large volume of very expensive inter-datacenter WAN traffic. The ideal solution is to partition user data across the datacenters so that each geo-distributed datacenter maintains only one partition of the whole data set. Unfortunately, partitioning OSN data with traditional graph algorithms is known to be very difficult because of the high interconnection and inter-dependency within OSN data [2]. Moreover, geo-partitioning goes beyond the traditional graph partitioning problem because user-perceived latency is a critical Quality-of-Service (QoS) concern that must be taken into account.