A novel dynamic network data replication scheme based on historical access record and proactive deletion

  • Authors:
  • Zhe Wang;Tao Li;Naixue Xiong;Yi Pan

  • Affiliations:
  • Dept. of Computer Science, Sichuan University, Chengdu, China;Dept. of Computer Science, Sichuan University, Chengdu, China;Dept. of Computer Science, Georgia State University, Atlanta, USA;Dept. of Computer Science, Georgia State University, Atlanta, USA

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Data replication is becoming a popular technology in many fields such as cloud storage, Data grids and P2P systems. By replicating files to other servers/nodes, we can reduce network traffic and file access time and increase data availability to react natural and man-made disasters. However, it does not mean that more replicas can always have a better system performance. Replicas indeed decrease read access time and provide better fault-tolerance, but if we consider write access, maintaining a large number of replications will result in a huge update overhead. Hence, a trade-off between read access time and write updating cost is needed. File popularity is an important factor in making decisions about data replication. To avoid data access fluctuations, historical file popularity can be used for selecting really popular files. In this research, a dynamic data replication strategy is proposed based on two ideas. The first one employs historical access records which are useful for picking up a file to replicate. The second one is a proactive deletion method, which is applied to control the replica number to reach an optimal balance between the read access time and the write update overhead. A unified cost model is used as a means to measure and compare the performance of our data replication algorithm and other existing algorithms. The results indicate that our new algorithm performs much better than those algorithms.