Design of file size and type of access based replication algorithm for data grid

  • Authors:
  • K. Jain;A. V. Vidhate;V. Wangikar;S. Shah

  • Affiliations:
  • Xavier Institute of Engineering, Mumbai, India;Xavier Institute of Engineering, Mumbai, India;K.C. College of Engineering, Mumbai, India;Vidyalankar Institute of Technology, Mumbai, India

  • Venue:
  • Proceedings of the International Conference & Workshop on Emerging Trends in Technology
  • Year:
  • 2011

Abstract

Grid computing enables effective sharing of computational and storage resources among geographically distributed users. Most organizations have large amounts of underutilized computing power and storage; most desktop machines, for instance, are busy less than 5 percent of the time. Grid computing provides a framework for exploiting these underutilized resources and thus increases the efficiency of resource usage. Nowadays, many commercial, business, and research institutions produce huge amounts of data and need to store this data on the secondary storage of their machines. The users of this data are geographically distributed and want to collaborate on the same problem. Data grids focus on providing secure access to distributed, heterogeneous pools of data. They harness data, storage, and network resources located in distinct administrative domains, and provide high-speed, reliable access to data. Data access can be optimized through data replication, whereby identical copies of data are generated and stored at various sites. A good replication strategy should minimize latency and reduce access time while making efficient use of resources. This paper therefore focuses on improving data grid performance. We first present a detailed analysis of replication strategies such as No replication, Always replication, and the Economic model simulated by OptorSim. We then propose a dynamic replication strategy that switches between No replication and Always replication based on file size and type of access. We argue that the proposed file size and type of access based replication algorithm will minimize access latencies and the execution time of jobs on the Grid. In the next phase, we shall implement the algorithm on a simulator and observe its performance.
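To make the idea of switching between the two strategies concrete, the following is a minimal sketch and not the authors' algorithm. The abstract states only that the choice depends on file size and type of access; the size threshold, the read/write distinction, the direction of each rule, and every name below (`FileRequest`, `AccessType`, `SIZE_THRESHOLD_MB`, `choose_strategy`) are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class AccessType(Enum):
    READ = "read"
    WRITE = "write"


class Strategy(Enum):
    NO_REPLICATION = "no_replication"
    ALWAYS_REPLICATION = "always_replication"


@dataclass(frozen=True)
class FileRequest:
    """Hypothetical description of one file access issued by a grid job."""
    name: str
    size_mb: float
    access: AccessType


# Assumed cutoff; the abstract does not give a value.
SIZE_THRESHOLD_MB = 1024.0


def choose_strategy(request: FileRequest,
                    threshold_mb: float = SIZE_THRESHOLD_MB) -> Strategy:
    """Pick a strategy per request (illustrative rules, not from the paper)."""
    # Assumption: files being written are not replicated, so that no
    # extra copies have to be kept consistent.
    if request.access is AccessType.WRITE:
        return Strategy.NO_REPLICATION
    # Assumption: files above the threshold are read remotely, because
    # copying them would cost more transfer time and storage than it saves.
    if request.size_mb > threshold_mb:
        return Strategy.NO_REPLICATION
    # Remaining case: read access to a file at or below the threshold.
    return Strategy.ALWAYS_REPLICATION


if __name__ == "__main__":
    for req in (
        FileRequest("small_read.dat", 100.0, AccessType.READ),
        FileRequest("large_read.dat", 5000.0, AccessType.READ),
        FileRequest("small_write.dat", 100.0, AccessType.WRITE),
    ):
        print(req.name, choose_strategy(req).value)
```

Under these assumed rules, only read accesses to files at or below the threshold trigger replication; a real implementation would take its cutoff and access categories from the paper's full text and from simulation results.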