CSAR-2: a case study of parallel file system dependability analysis

  • Authors:
  • D. Cotroneo;G. Paolillo;S. Russo;M. Lauria

  • Affiliations:
  • Dipartimento di Informatica e Sistemistica, Universita’ di Napoli “Federico II”, Napoli, Italy;Dipartimento di Informatica e Sistemistica, Universita’ di Napoli “Federico II”, Napoli, Italy;Dipartimento di Informatica e Sistemistica, Universita’ di Napoli “Federico II”, Napoli, Italy;Dept. of Computer Science and Engineering, The Ohio State University, Columbus, OH

  • Venue:
  • HPCC'05 Proceedings of the First International Conference on High Performance Computing and Communications
  • Year:
  • 2005

Abstract

Modern cluster file systems such as PVFS, which stripe files across multiple nodes, have been shown to provide high aggregate I/O bandwidth but are prone to data loss, since the failure of a single disk or server affects the whole file system. To address this problem, a number of distributed data redundancy schemes have been proposed that represent different trade-offs between performance, storage efficiency, and level of fault tolerance. However, the actual level of dependability of an enhanced striped file system is determined by more than just the redundancy scheme adopted; in general it also depends on other factors, such as the type of fault detection mechanism and the nature and speed of the recovery. In this paper we address the question of how to assess the dependability of CSAR, a version of PVFS augmented with a RAID5 distributed redundancy scheme that we described in a previous work.
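
As a rough illustration of the redundancy idea mentioned in the abstract, the sketch below shows how XOR parity, the mechanism underlying RAID5, lets a stripe survive the loss of any single block. It is a minimal sketch and not CSAR's implementation: the function names, block sizes, and stripe layout are assumptions made for illustration, and the rotation of parity across servers that RAID5 performs is not modeled.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, lost_index):
    """Rebuild the block at lost_index by XOR-ing all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


# Hypothetical stripe: three data blocks, each assumed to live on a
# different server, plus one parity block on a fourth server.
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = make_stripe(data)

# Simulate losing the server that holds block 1 and rebuild its content.
rebuilt = recover(stripe, 1)
```

Without the parity block, losing any one server would make the striped file unreadable. With it, a single failure is recoverable, at the cost of one extra block per stripe and additional work on every write. A second failure before the first is repaired is not recoverable, which is one reason detection and recovery speed matter for overall dependability.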