Improving the data placement algorithm of randomization in SAN

Authors:
Nianmin Yao;Jiwu Shu;Weimin Zheng
Affiliations:
Department of Computer Science and Technology, Tsinghua University;No institute;No institute
Venue:
ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part III
Year:
2005

Abstract

Using randomization as the data placement algorithm has many advantages, such as simple computation, long-term load balancing, and low cost. In particular, recent work has improved it so that it scales well when disks are added to or removed from large storage systems such as a SAN (Storage Area Network). However, it still has a shortcoming: it cannot ensure load balancing in the short term when some very hot data blocks are accessed frequently. This situation is common in Web environments. To solve the problem, this paper presents an algorithm, built on randomization, for selecting hot-spot data blocks, together with a data placement scheme based on that algorithm. The difference from pure randomization is that the scheme redistributes a few very hot data blocks so that the load stays balanced over any short period. With this method, access-frequency status needs to be maintained for only a few blocks; moreover, the method is easy to implement and costs little. A simulation model is implemented to compare the new data placement method with one that uses randomization alone. A real Web log is used to simulate the load, and the results show that the new distribution method makes the disks' load more balanced and improves performance by up to 100 percent. The new data placement algorithm will be more efficient in the storage system of a busy Web server.
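The abstract does not give the details of the hot-spot selection algorithm, so the following Python sketch is only one plausible reading of the idea, not the authors' method. It assumes hash-based pseudo-random placement, a small bounded table of access counts, and a rebalancing step that moves a hot block to the least-loaded disk when doing so lowers the load of its current disk. All names (`random_disk`, `HotSpotPlacer`, `max_tracked`, `hot_threshold`) are hypothetical and chosen for illustration.

```python
import hashlib
from collections import Counter


def random_disk(block_id: str, num_disks: int) -> int:
    """Pseudo-random placement: hash the block ID onto a disk index."""
    digest = hashlib.sha256(block_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_disks


class HotSpotPlacer:
    """Randomized placement plus relocation of a few hot blocks (illustrative)."""

    def __init__(self, num_disks: int, max_tracked: int = 8, hot_threshold: int = 5):
        self.num_disks = num_disks
        self.max_tracked = max_tracked      # assumed bound on tracked blocks
        self.hot_threshold = hot_threshold  # assumed cutoff for "hot"
        self.counts = Counter()             # access counts for a few blocks only
        self.overrides = {}                 # hot block -> relocated disk
        self.load = [0] * num_disks         # requests per disk in this window

    def locate(self, block_id: str) -> int:
        """Relocated hot blocks use the override; all others use the hash."""
        if block_id in self.overrides:
            return self.overrides[block_id]
        return random_disk(block_id, self.num_disks)

    def access(self, block_id: str) -> int:
        """Record one request and return the disk that serves it."""
        disk = self.locate(block_id)
        self.load[disk] += 1
        if block_id not in self.counts and len(self.counts) >= self.max_tracked:
            # Table is full: evict the coldest tracked block to make room.
            coldest = min(self.counts, key=self.counts.get)
            del self.counts[coldest]
        self.counts[block_id] += 1
        return disk

    def rebalance(self) -> list:
        """Move hot blocks to the least-loaded disk when that reduces imbalance."""
        moved = []
        for block_id, n in self.counts.most_common():
            if n < self.hot_threshold:
                break  # remaining blocks are colder still
            src = self.locate(block_id)
            dst = min(range(self.num_disks), key=self.load.__getitem__)
            # Only move if the target ends up lighter than the source was.
            if dst != src and self.load[dst] + n < self.load[src]:
                self.load[src] -= n
                self.load[dst] += n
                self.overrides[block_id] = dst
                moved.append(block_id)
        return moved

    def start_window(self) -> None:
        """Begin a new short-term measurement window."""
        self.counts.clear()
        self.load = [0] * self.num_disks
```

In this sketch only `max_tracked` counters and a small override table are kept, which mirrors the claim that status information is needed for just a few blocks. Every other block is still located by the hash alone, so the long-term behaviour of randomization is unchanged.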