OS support for a commodity database on PC clusters: distributed devices vs. distributed file systems

  • Authors:
  • Felix Rauch; Thomas M. Stricker

  • Affiliations:
  • Swiss Federal Institute of Technology, Zurich, Switzerland; Swiss Federal Institute of Technology, Zurich, Switzerland

  • Venue:
  • ADC '05 Proceedings of the 16th Australasian database conference - Volume 39
  • Year:
  • 2005

Abstract

In this paper we attempt to parallelise a commodity database for OLAP on a cluster of commodity PCs by using a distributed high-performance storage subsystem. By parallelising the underlying storage architecture we eliminate the need to make any changes to the database software. We look at two options that differ in their complexity and features: distributed devices and distributed file systems. The former aggregate several single disks within the cluster into a RAID device across the network. The latter offer all the features of a real file system at the price of considerably increased complexity. We configured a Linux version of ORACLE to run on various distributed devices and distributed file systems, and ran a TPC-D benchmark on our cluster of commodity PCs interconnected by Gigabit Ethernet. While distributed devices achieve at least the performance of local disks, they also offer the benefit of using all surplus storage in a cluster. The distributed file systems seem to run into performance problems due to their increased complexity. We explain the experimental results with an analytic model of the cluster architecture and include a comparison with the same workload on an architecture that distributes the TPC-D queries at a higher level (and not just at the level of the underlying storage system). We conclude with suggestions for achieving higher performance in future clusters of commodity PCs.
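
As an illustration of the distributed-device idea, the sketch below shows one way a logical block of an aggregated device could be mapped to a disk on a cluster node. It assumes simple RAID-0 style striping; the abstract does not state which RAID level or stripe size the authors used, and the names `Location`, `locate_block`, `num_nodes` and `stripe_blocks` are hypothetical, chosen only for this example.

```python
from typing import NamedTuple


class Location(NamedTuple):
    node: int    # index of the cluster node whose disk holds the block
    offset: int  # block offset on that node's local disk


def locate_block(logical_block: int, num_nodes: int, stripe_blocks: int = 1) -> Location:
    """Map a logical block to (node, local offset), assuming RAID-0 striping.

    This is an assumed layout for illustration, not the paper's implementation.
    """
    if num_nodes <= 0 or stripe_blocks <= 0:
        raise ValueError("num_nodes and stripe_blocks must be positive")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")

    # Which stripe unit the block falls into, and its position inside that unit.
    stripe_index, within = divmod(logical_block, stripe_blocks)
    # Stripe units are dealt out to the nodes round-robin, row by row.
    row, node = divmod(stripe_index, num_nodes)
    return Location(node=node, offset=row * stripe_blocks + within)
```

Under this layout, consecutive stripe units land on different nodes, so a large sequential scan, as is typical for OLAP queries, can draw on several disks and network links at once.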