dCache, storage system for the future

  • Authors:
  • Patrick Fuhrmann;Volker Gülzow

  • Affiliations:
  • Deutsches Elektronen Synchrotron, Hamburg;Deutsches Elektronen Synchrotron, Hamburg

  • Venue:
  • Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

In 2007, the most challenging high energy physics experiment ever, the Large Hardon Collider(LHC), at CERN, will produce a sustained stream of data in the order of 300MB/sec, equivalent to a stack of CDs as high as the Eiffel Tower once per week. This data is, while produced, distributed and persistently stored at several dozens of sites around the world, building the LHC data grid. The destination sites are expected to provide the necessary middle-ware, so called Storage Elements, offering standard protocols to receive the data and to store it at the site specific Storage Systems. A major player in the set of Storage Elements is the dCache/SRM system. dCache/SRM has proven to be capable of managing the storage and exchange of several hundreds of terabytes of data, transparently distributed among dozens of disk storage nodes. One of the key design features of the dCache is that although the location and multiplicity of the data is autonomously determined by the system, based on configuration, cpu load and disk space, the name space is uniquely represented within a single file system tree. The system has shown to significantly improve the efficiency of connected tape storage systems, by caching, 'gather & flush' and scheduled staging techniques. Furthermore, it optimizes the throughput to and from data clients as well as smoothing the load of the connected disk storage nodes by dynamically replicating datasets on the detection of load hot spots. The system is tolerant against failures of its data servers which enables administrators to go for commodity disk storage components. Access to the data is provided by various standard protocols. Furthermore the software is coming with an implementation of the Storage Resource Manager protocol (SRM), which is evolving to an open standard for grid middleware to communicate with site specific storage fabrics.