Fault-Tolerant Distributed Mass Storage for LHC Computing

  • Authors:
  • Arne Wiebalck; Peter T. Breuer; Volker Lindenstruth; Timm M. Steinbeck

  • Venue:
  • CCGRID '03 Proceedings of the 3rd International Symposium on Cluster Computing and the Grid
  • Year:
  • 2003

Abstract

In this paper we present the concept and first prototyping results of a modular fault-tolerant distributed mass storage architecture for large Linux PC clusters as they are deployed by the upcoming particle physics experiments. The device masquerading technique using an Enhanced Network Block Device (ENBD) enables local RAID over remote disks as the key concept of the ClusterRAID system. The block-level interface to remote files, partitions or disks provided by the ENBD makes it possible to use the standard Linux software RAID to add fault tolerance to the system. Preliminary performance measurements indicate that the latency is comparable to that of a local hard drive. With four disks, throughput rates of up to 55 MB/s were achieved with first prototypes for a RAID0 setup, and about 40 MB/s for a RAID5 setup.
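
To illustrate why a RAID5 layer over remote block devices adds fault tolerance, the following is a minimal sketch of XOR parity within a single stripe. It is not the ClusterRAID or Linux software RAID implementation: the function names (`xor_blocks`, `make_stripe`, `rebuild_missing`) and the tiny 4-byte blocks are assumptions chosen for illustration, and the sketch keeps parity in a fixed position whereas real RAID5 rotates it across the disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same size")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Append a parity block to the data blocks (hypothetical fixed layout)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild_missing(stripe, missing_index):
    """Recompute the block at missing_index from all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != missing_index]
    return xor_blocks(survivors)


# Four "disks": three data blocks plus one parity block per stripe.
data = [b"node", b"disk", b"raid"]
stripe = make_stripe(data)

# Pretend the device holding block 1 is unreachable and rebuild its content.
recovered = rebuild_missing(stripe, 1)
```

Because XOR is its own inverse, any single missing block, data or parity, equals the XOR of the remaining ones. With four devices this leaves three blocks of usable capacity per stripe, and each write also has to update parity, which is one general reason a RAID5 setup tends to show lower write throughput than RAID0.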