Improving the data placement algorithm of randomization in SAN

  • Authors:
  • Nianmin Yao;Jiwu Shu;Weimin Zheng

  • Affiliations:
  • Department of Computer Science and Technology, Tsinghua University;No institute;No institute

  • Venue:
  • ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part III
  • Year:
  • 2005

Abstract

Randomization as a data placement algorithm has several advantages: simple computation, long-term load balancing, and low cost. Recent work has also improved it to scale well when disks are added to or removed from large storage systems such as a SAN (Storage Area Network). However, it cannot ensure load balancing in the short term when a few very hot data blocks are accessed frequently, a situation that is common in Web environments. To solve this problem, this paper presents an algorithm for selecting hot-spot data blocks and a data placement scheme built on it, both based on randomization. Unlike pure randomization, the scheme redistributes a few very hot data blocks so that the load stays balanced over any short period. The method only needs to maintain access-frequency status for a small number of blocks; it is also easy to implement and costs little. A simulation model is implemented to compare the new placement method with pure randomization. A real Web log is used to simulate the load, and the results show that the new method balances the disks' load better and improves performance by up to 100 percent. The new data placement algorithm will be more efficient in the storage system of a busy Web server.
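
The abstract does not spell out how hot blocks are selected or where they are moved, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes hash-based placement stands in for randomization, that the hottest blocks are simply the `n_hot` most frequently accessed ones in a trace, and that they are reassigned greedily to the least-loaded disk. All names (`random_disk`, `place_with_hot_spots`, `lookup`) are hypothetical.

```python
import hashlib
from collections import Counter


def random_disk(block_id: int, n_disks: int) -> int:
    """Repeatable pseudo-random placement: hash the block id onto a disk."""
    digest = hashlib.sha256(str(block_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_disks


def place_with_hot_spots(accesses, n_disks, n_hot):
    """Return (overrides for hot blocks, resulting per-disk access load).

    Assumption: "hot" means the n_hot most frequently accessed blocks.
    """
    counts = Counter(accesses)
    hot = counts.most_common(n_hot)
    hot_ids = {block for block, _ in hot}

    # Cold blocks stay wherever the hash placed them.
    load = [0] * n_disks
    for block, count in counts.items():
        if block not in hot_ids:
            load[random_disk(block, n_disks)] += count

    # Greedy step: hottest block first, onto the currently least-loaded disk.
    overrides = {}
    for block, count in hot:
        target = min(range(n_disks), key=load.__getitem__)
        overrides[block] = target
        load[target] += count
    return overrides, load


def lookup(block_id, n_disks, overrides):
    """Only hot blocks need a table entry; all others use the hash."""
    return overrides.get(block_id, random_disk(block_id, n_disks))
```

In this sketch the only state kept is the small `overrides` table, which mirrors the abstract's point that status information is maintained for just a few blocks; a real system would also need to age the counters and migrate the data, which is omitted here.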