GCplace: geo-cloud based correlation aware data replica placement

Authors:
Zhen Ye;Shanping Li;Xiaozhen Zhou
Affiliations:
Zhejiang University, Hangzhou, China;Zhejiang University, Hangzhou, China;Zhejiang University, Hangzhou, China
Venue:
Proceedings of the 28th Annual ACM Symposium on Applied Computing
Year:
2013

Abstract

Cross-datacenter data replication is widely used in geo-cloud environments because it increases application availability and improves performance. However, at cloud scale it is difficult to decide where to place replicas among datacenters so as to minimize overall user access latency. Correlation among data items makes the replica placement problem more complex. To address these scale and data-correlation issues, we propose a two-step approach called GCplace. Before GCplace is applied, a network coordinate system is used to predict the latency between all users and datacenter nodes. In the first step of GCplace, we introduce stream-based similarity clustering, which uses a small number of micro clusters to represent a huge number of users and thus significantly reduces the cost of the replica placement algorithm. In the second step, an iterative algorithm is proposed to obtain an approximate solution. We evaluated our approach on a large-scale real network latency dataset. Comprehensive experiments show that GCplace can reduce average user access latency significantly.
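
The abstract does not give the algorithms themselves, so the following is only an illustrative sketch of the general two-step shape it describes, not the authors' method. It assumes that users and datacenters already have 2-D network coordinates whose Euclidean distance stands in for predicted latency, that a simple one-pass radius-based clustering can play the role of the stream-based micro clustering, and that a swap-based local search can play the role of the iterative placement step. Data correlation is not modeled here. All function names and parameters (`micro_cluster`, `place_replicas`, `radius`, `k`) are hypothetical.

```python
import math


def micro_cluster(users, radius):
    """One-pass clustering (assumed stand-in for stream-based clustering).

    Each user joins the nearest existing micro cluster whose centroid is
    within `radius`; otherwise it starts a new micro cluster.
    Returns a list of (centroid, weight) pairs.
    """
    clusters = []  # each entry: [sum_x, sum_y, count]
    for x, y in users:
        best, best_d = None, radius
        for c in clusters:
            centroid = (c[0] / c[2], c[1] / c[2])
            d = math.dist((x, y), centroid)
            if d <= best_d:
                best, best_d = c, d
        if best is None:
            clusters.append([x, y, 1])
        else:
            best[0] += x
            best[1] += y
            best[2] += 1
    return [((sx / n, sy / n), n) for sx, sy, n in clusters]


def total_latency(clusters, replicas):
    """Weighted sum of each micro cluster's distance to its nearest replica."""
    return sum(w * min(math.dist(c, r) for r in replicas) for c, w in clusters)


def place_replicas(clusters, datacenters, k):
    """Swap-based local search (assumed stand-in for the iterative step).

    Starts from the first k datacenters and keeps applying any single swap
    that strictly lowers the weighted latency, until no swap helps.
    """
    chosen = list(datacenters[:k])
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in datacenters:
                if cand in chosen:
                    continue
                trial = chosen[:i] + [cand] + chosen[i + 1:]
                if total_latency(clusters, trial) < total_latency(clusters, chosen):
                    chosen = trial
                    improved = True
    return chosen
```

Because the placement step works on weighted micro clusters rather than individual users, its cost depends on the number of clusters, which is the kind of reduction the abstract attributes to the first step. The local search yields a local optimum only, consistent with the abstract's description of an approximate solution.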