Using classification techniques to improve replica selection in data grid

  • Authors:
  • Hai Jin;Jin Huang;Xia Xie;Qin Zhang

  • Affiliations:
  • Cluster and Grid Computing Lab, Huazhong University of Science and Technology, Wuhan, China;Cluster and Grid Computing Lab, Huazhong University of Science and Technology, Wuhan, China;Cluster and Grid Computing Lab, Huazhong University of Science and Technology, Wuhan, China;Cluster and Grid Computing Lab, Huazhong University of Science and Technology, Wuhan, China

  • Venue:
  • ODBASE'06/OTM'06 Proceedings of the 2006 Confederated international conference on On the Move to Meaningful Internet Systems: CoopIS, DOA, GADA, and ODBASE - Volume Part II
  • Year:
  • 2006

Abstract

Data grids are developed to facilitate the sharing of data and resources located in different parts of the world. The major barrier to fast data access in a data grid is the high latency of wide area networks and the Internet. Data replication is adopted to improve data access performance. When different sites hold replicas, selecting the best replica brings significant benefits. In this paper, we propose a new replica selection strategy based on classification techniques. In this strategy, the replica selection problem is regarded as a classification problem. The data transfer history is used to help predict the best site holding the replica. A switch mechanism in the replica selection model avoids wasting time on inaccurate classification results. We study and simulate the KNN and SVM methods for different file access patterns and compare the results with the traditional replica catalog model. The results show that our replica selection model outperforms the traditional one for certain file access requests.
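
The abstract gives no implementation details, so the following is only a minimal sketch of how replica selection might be framed as a KNN classification problem with a fallback to a catalog lookup. The feature vector (file size, hour of day), the vote-share threshold `min_share`, the function names, and the fallback rule are all assumptions made for illustration; they are not taken from the paper, and the paper's actual switch mechanism may work differently.

```python
import math
from collections import Counter


def knn_predict(history, query, k=3):
    """Return (site, vote_share) from the k nearest past transfers.

    history: list of (feature_tuple, best_site) records (hypothetical layout).
    query:   feature tuple describing the new file request.
    """
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(site for _, site in nearest)
    site, count = votes.most_common(1)[0]
    return site, count / len(nearest)


def select_replica(history, query, catalog_sites, k=3, min_share=0.6):
    """Use the classifier when it looks reliable, else fall back to the catalog.

    The fallback rule here (low vote share, or a predicted site that does not
    hold the replica) is an assumed stand-in for the paper's switch mechanism.
    """
    if history:
        site, share = knn_predict(history, query, k)
        if share >= min_share and site in catalog_sites:
            return site, "classifier"
    # Assumed stand-in for a traditional replica catalog lookup:
    # simply take the first site listed as holding the replica.
    return catalog_sites[0], "catalog"


# Hypothetical transfer history: (file size in MB, hour of day) -> best site.
history = [
    ((10.0, 1.0), "siteA"),
    ((12.0, 1.0), "siteA"),
    ((11.0, 2.0), "siteA"),
    ((500.0, 9.0), "siteB"),
    ((520.0, 9.0), "siteB"),
    ((510.0, 8.0), "siteB"),
]

choice = select_replica(history, (11.0, 1.0), ["siteB", "siteA"])
```

In this sketch the features are not normalized, so the file size dominates the distance; a real experiment would likely scale the features and choose `k` and the threshold empirically.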