Algorithms for non-uniform size data placement on parallel disks

  • Authors:
  • Srinivas Kashyap; Samir Khuller

  • Affiliations:
  • Department of Computer Science, University of Maryland, College Park, MD; Department of Computer Science, University of Maryland, College Park, MD and Institute for Advanced Computer Studies, University of Maryland, College Park, MD

  • Venue:
  • Journal of Algorithms
  • Year:
  • 2006

Abstract

We study an optimization problem that arises in the context of data placement in a multimedia storage system. We are given a collection of $M$ multimedia objects (data items) that need to be assigned to a storage system consisting of $N$ disks $d_1, d_2, \ldots, d_N$. We are also given sets $U_1, U_2, \ldots, U_M$ such that $U_i$ is the set of clients seeking the $i$th data item. Data item $i$ has size $s_i$. Each disk $d_j$ is characterized by two parameters: its storage capacity $C_j$, which indicates the maximum total size of data items that may be assigned to it, and its load capacity $L_j$, which indicates the maximum number of clients that it can serve. The goal is to find a placement of data items on disks and an assignment of clients to disks that maximizes the total number of clients served, subject to the capacity constraints of the storage system.

We study this data placement problem for homogeneous storage systems, where all the disks are identical. We assume that all disks have a storage capacity of $k$ and a load capacity of $L$. Previous work on this problem has assumed that all data items have unit size, in other words $s_i = 1$ for all $i$. Even for this case, the problem is NP-hard. For the case where $s_i \in \{1, \ldots, \Delta\}$ for some constant $\Delta$, we develop a polynomial time approximation scheme (PTAS). This result is obtained by developing two algorithms, one that works for constant $k$ and one that works for arbitrary $k$. The algorithm for arbitrary $k$ guarantees a solution in which at least a $\frac{k-\Delta}{k+\Delta}\left(1 - \frac{1}{\left(1 + \sqrt{k/(2\Delta)}\right)^2}\right)$-fraction of all clients are assigned to a disk (under certain assumptions). In addition, we develop an algorithm for which we can prove tight bounds when $s_i \in \{1, 2\}$. In fact, we can show that a $\left(1 - \frac{1}{\left(1 + \sqrt{\lfloor k/2 \rfloor}\right)^2}\right)$-fraction of all clients can be assigned (under certain natural assumptions), regardless of the input distribution.
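To make the constraints and the two stated guarantees concrete, the following is a minimal Python sketch for the homogeneous case (every disk has storage capacity k and load capacity L). It is not one of the paper's algorithms: it only checks whether a given placement and assignment is feasible and evaluates the two fractions quoted in the abstract. The function names, the dictionary-per-disk representation, and the example instance are all hypothetical choices made for illustration, and the bounds hold only under the assumptions the abstract mentions without spelling out.

```python
from math import sqrt


def served_clients(sizes, demands, disks, k, L):
    """Count clients served by a candidate solution, or raise if infeasible.

    sizes[i]   -- size s_i of data item i
    demands[i] -- number of clients |U_i| requesting item i
    disks[j]   -- dict mapping item index -> clients of that item served by disk j
                  (an item is stored on disk j exactly when it is a key of disks[j])
    k, L       -- storage and load capacity shared by all disks (homogeneous case)
    """
    served_per_item = [0] * len(sizes)
    total = 0
    for j, disk in enumerate(disks):
        storage = sum(sizes[i] for i in disk)  # total size of items placed on disk j
        load = sum(disk.values())              # clients assigned to disk j
        if storage > k:
            raise ValueError(f"disk {j} exceeds storage capacity")
        if load > L:
            raise ValueError(f"disk {j} exceeds load capacity")
        for i, count in disk.items():
            served_per_item[i] += count
        total += load
    for i, count in enumerate(served_per_item):
        if count > demands[i]:
            raise ValueError(f"item {i} serves more clients than requested it")
    return total


def fraction_arbitrary_k(k, delta):
    """Fraction quoted for sizes in {1, ..., delta} and arbitrary k."""
    return (k - delta) / (k + delta) * (1 - 1 / (1 + sqrt(k / (2 * delta))) ** 2)


def fraction_sizes_one_two(k):
    """Fraction quoted for sizes in {1, 2}; k // 2 is floor(k / 2)."""
    return 1 - 1 / (1 + sqrt(k // 2)) ** 2


# Hypothetical instance: three items, two disks with k = 3 and L = 5.
example_sizes = [1, 2, 1]
example_demands = [3, 4, 2]
example_disks = [{0: 3, 1: 2}, {1: 2, 2: 2}]
example_total = served_clients(example_sizes, example_demands, example_disks, k=3, L=5)
```

In this instance item 1 is stored on both disks, so its four clients can be split between them; replicating an item costs storage on each disk that holds it, which is the trade-off the placement problem has to resolve.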