Data Placement in P2P Data Grids Considering the Availability, Security, Access Performance and Load Balancing

  • Authors:
  • Manghui Tu;Hui Ma;Liangliang Xiao;I-Ling Yen;Farokh Bastani;Dianxiang Xu

  • Affiliations:
  • Department of CITG, Purdue University Calumet, Hammond, USA;Cisco Systems, Inc., Austin, USA;Department of Computer Science, University of Texas at Dallas, Dallas, USA;Department of Computer Science, University of Texas at Dallas, Dallas, USA;Department of Computer Science, University of Texas at Dallas, Dallas, USA;College of Business and Information Systems, Dakota State University, Madison, USA

  • Venue:
  • Journal of Grid Computing
  • Year:
  • 2013

Abstract

Data dependability is an important issue in data Grids. Replication schemes are widely used in distributed systems to ensure availability and improve access performance. Alternatively, data partitioning schemes (secret sharing, or erasure coding with encryption) can provide availability and, in addition, confidentiality protection. In peer-to-peer data Grids, such confidentiality protection is essential because the nodes hosting the data shares may be untrustworthy or compromised. However, the difficulty of generating new shares and the potential security concerns of share reallocation make a pure data partitioning scheme hard to adapt to dynamic user access patterns. In this paper, we combine replication and data partitioning to assure data availability, confidentiality, load balance, and efficient access for data Grid applications. Data are partitioned and the shares are dispersed; the shares may then be replicated to achieve better performance, load balance, and availability. Models for assessing confidentiality, availability, load balance, and communication cost are developed and used as metrics to guide placement decisions. Because these goals conflict, we model the placement decision as a multi-objective optimization problem and use a genetic algorithm to find solutions that approximate the Pareto-optimal placements.
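
To make the combination of partitioning and replication concrete, the following minimal Python sketch illustrates two ideas from the abstract: how replicating shares can raise the availability of threshold-partitioned data, and how conflicting objectives lead to a set of non-dominated (Pareto) placements. It is not the paper's model. The function names, the (k, n) threshold assumption (any k of n shares reconstruct the data), the single per-node availability `node_up`, and the independence of node failures are all assumptions made for illustration, and the genetic algorithm itself is not shown.

```python
from math import comb


def share_availability(node_up: float, replicas: int) -> float:
    """Probability that at least one replica of a share is reachable.

    Assumes (hypothetically) independent nodes, each up with probability node_up,
    and that the replicas of a share sit on distinct nodes.
    """
    return 1.0 - (1.0 - node_up) ** replicas


def data_availability(k: int, n: int, node_up: float, replicas: int) -> float:
    """Probability that at least k of the n distinct shares are reachable."""
    p = share_availability(node_up, replicas)
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def dominates(a, b) -> bool:
    """True if a is no worse than b in every objective and strictly better in one.

    All objectives are treated as costs to be minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative candidates as (communication cost, load imbalance) pairs.
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = pareto_front(candidates)
```

Under these assumptions, a (2, 3) scheme on nodes that are up 90% of the time is more available when each share has two replicas than when it has one, while every extra replica also adds storage and places another copy of a share on a possibly untrusted node. Trade-offs of this kind are why a placement is judged against several metrics at once. A multi-objective genetic algorithm would typically apply a dominance test like `dominates` when ranking candidate placements.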