The little engine(s) that could: scaling online social networks
Proceedings of the ACM SIGCOMM 2010 conference
Volley: automated data placement for geo-distributed cloud services
NSDI'10 Proceedings of the 7th USENIX conference on Networked systems design and implementation
Multilevel algorithms for partitioning power-law graphs
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Large-scale online social networks (OSNs) usually replicate data across multiple datacenters in different geographic locations to ensure high availability and performance [1]. The de facto replication method in current OSNs (e.g., Facebook) is full replication, in which each geo-distributed datacenter maintains a copy of all the data. Full replication achieves good performance in a simple way but incurs high maintenance overhead (e.g., replica storage and synchronization). First, total storage grows linearly with the number of datacenters deployed, so the approach scales poorly. Second, the replicas at all locations must be kept synchronized, which generates a large volume of expensive inter-datacenter WAN traffic. The ideal solution is to partition user data across the datacenters so that each geo-distributed datacenter maintains one partition of the whole data set. Unfortunately, partitioning OSN data with traditional graph algorithms is known to be very difficult because of the high interconnection and interdependency within OSN data [2]. Moreover, geo-partitioning goes beyond the traditional graph partitioning problem because user-perceived latency is a critical Quality-of-Service (QoS) issue that must also be considered.