Dynamic Data Reallocation for Skew Management in Shared-Nothing Parallel Databases

  • Authors:
  • Abdelsalam (Sumi) Helal; David Yuan; Hesham El-Rewini

  • Affiliations:
  • Microelectronics and Computer Technology Corporation (MCC), Austin, Texas 78759. E-mail: helal@mcc.com; MCI Telecommunications, 901 International Pkwy, Richardson, TX 75081. E-mail: David.Yuan@mci.com; Department of Computer Science, University of Nebraska at Omaha, Omaha, NE 68182-0500. E-mail: rewini@csalpha.unomaha.edu

  • Venue:
  • Distributed and Parallel Databases
  • Year:
  • 1997

Abstract

The shared-nothing parallel database architecture is gaining wide popularity due to its scalability and increased data availability. However, in order to efficiently utilize parallelism in such an architecture, independent data sets must be assigned to different processing nodes. This, of course, can initially be achieved by employing a careful partitioning scheme that allocates disjoint data sets to different processors. However, variations in the data access pattern may render some processors overloaded while others are underloaded. This skewness in data access decreases the effective parallelism and eventually leads to overall performance degradation. A number of solutions have been proposed to periodically perform data re-allocation to remove the skewness in data access. Most of the proposed solutions perform either static re-allocation, which requires the system to be taken off-line, or dynamic but non-transactional re-allocation. In this paper, we introduce a dynamic and transactional re-allocation scheme based on the work on disk cooling in shared memory architecture by Scheuermann et al. The proposed scheme enhances the effective parallelism in the system regardless of the variations in the pattern of access. The proposed scheme detects access skew as it occurs and re-allocates data partitions to underloaded processing elements on the fly. Only the block being moved becomes unavailable. In addition, mutual consistency among transactions concurrent to the re-allocation event is preserved. The proposed scheme also uses replication as an additional cooling mechanism to help distribute access load over multiple replicas. We conducted a series of simulation experiments to study the behavior of shared-nothing parallel database systems with and without the proposed dynamic re-allocation scheme. We also experimented with several replication strategies to measure their impact on the system performance. Finally, we studied the effect of using different concurrency control strategies on the efficiency of dynamic re-allocation.
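
To make the idea of detecting access skew and moving blocks to underloaded nodes concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. Everything in it is an assumption made for illustration: the `Node` class, the per-block access count used as "heat", the `threshold` parameter, and the greedy choice of which block to move. It also ignores the transactional and replication aspects that the abstract describes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical processing node; blocks maps block id -> access count ("heat")."""
    name: str
    blocks: dict = field(default_factory=dict)

    @property
    def load(self):
        # Node load is assumed to be the sum of the heats of its blocks.
        return sum(self.blocks.values())


def cool_once(nodes, threshold=1.2):
    """Perform at most one cooling step.

    If the hottest node's load exceeds threshold * mean load, move one block
    from it to the coolest node. Returns (block_id, source, destination),
    or None when no move is warranted or possible.
    """
    mean = sum(n.load for n in nodes) / len(nodes)
    hot = max(nodes, key=lambda n: n.load)
    cold = min(nodes, key=lambda n: n.load)
    if mean == 0 or hot.load <= threshold * mean:
        return None  # no significant skew detected
    gap = hot.load - cold.load
    # Assumed heuristic: move the hottest block whose heat is at most half the
    # gap, so the destination does not end up hotter than the source.
    candidates = [(heat, block) for block, heat in hot.blocks.items()
                  if heat <= gap / 2]
    if not candidates:
        return None
    _, block = max(candidates)
    cold.blocks[block] = hot.blocks.pop(block)
    return block, hot.name, cold.name


# Usage: one overloaded node and two lightly loaded ones.
nodes = [
    Node("n0", {"a": 50, "b": 30, "c": 20}),
    Node("n1", {"d": 10}),
    Node("n2", {"e": 10}),
]
moves = []
while (move := cool_once(nodes)) is not None:
    moves.append(move)
print(moves)                    # [('b', 'n0', 'n1'), ('c', 'n0', 'n2')]
print([n.load for n in nodes])  # [50, 40, 30]
```

In this toy run the loads go from 100/10/10 to 50/40/30 after two moves. A real scheme along the lines of the abstract would additionally have to keep concurrent transactions consistent while a block is in flight, which this sketch does not attempt.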