The Large Hadron Collider (LHC) at CERN, the European Organization for Nuclear Research, will produce unprecedented volumes of data when it starts operation in 2007. To provide for its computational needs, the LHC computing grid (LCG) is being deployed as a worldwide computational grid service, providing the middleware upon which the physics analysis for the LHC will be carried out. In 2003, versions of this middleware were deployed that were based on the middleware produced by the European Data Grid project (EDG). In 2004 the LCG-2 release, which consisted of the EDG middleware with some minor modifications, was deployed for use by the LHC experiments. A series of data challenges run by these experiments was the first real production use of LCG. During the data challenges, many issues were exposed that had not shown up in more limited tests. The deployment, service and development teams worked closely with the experiments to understand these issues. Some were solved during the data challenges, while others revealed fundamental problems with the middleware as deployed in LCG-2. One of these fundamental problems was the performance under real load of the catalog component provided by EDG, the replica location service. To solve it, a new component was designed, the LCG file catalog (LFC). The LFC moves away from the replica location service model used in previous LCG releases, towards a hierarchical model that resembles a UNIX file system. It also adds missing functionality requested by the experiments. This paper presents the architecture and implementation of the LFC and evaluates it in a series of performance tests, with up to forty million entries and one hundred requesting threads from multiple clients. The results show good scalability up to the limits of these tests, and compare favourably with other grid catalog implementations.
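To make the hierarchical, UNIX-like catalog model concrete, the following is a minimal in-memory sketch of a namespace in which directories contain logical file names and each logical file maps to a list of replica locations. It is an illustration only and not the LFC implementation or its API: the names `ToyCatalog`, `Entry`, `mkdir`, `register`, `list_replicas` and `listdir`, as well as the example paths and storage URLs, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One node in the toy namespace: a directory or a logical file."""
    is_dir: bool
    children: dict = field(default_factory=dict)  # name -> Entry (directories only)
    replicas: list = field(default_factory=list)  # replica URLs (files only)


class ToyCatalog:
    """Hypothetical hierarchical catalog; not the real LFC interface."""

    def __init__(self):
        self.root = Entry(is_dir=True)

    @staticmethod
    def _split(path):
        # "/grid/demo/f.dat" -> ["grid", "demo", "f.dat"]
        return [part for part in path.split("/") if part]

    def _walk(self, parts):
        # Descend from the root one path component at a time.
        node = self.root
        for name in parts:
            if not node.is_dir or name not in node.children:
                raise FileNotFoundError("/" + "/".join(parts))
            node = node.children[name]
        return node

    def _parent_dir(self, path):
        # Return the parent directory entry and the final path component.
        *parent, name = self._split(path)
        directory = self._walk(parent)
        if not directory.is_dir:
            raise NotADirectoryError(path)
        return directory, name

    def mkdir(self, path):
        directory, name = self._parent_dir(path)
        if name in directory.children:
            raise FileExistsError(path)
        directory.children[name] = Entry(is_dir=True)

    def register(self, path, replica_url):
        # Create the logical file on first use, then record one more replica.
        directory, name = self._parent_dir(path)
        entry = directory.children.setdefault(name, Entry(is_dir=False))
        if entry.is_dir:
            raise IsADirectoryError(path)
        entry.replicas.append(replica_url)

    def list_replicas(self, path):
        entry = self._walk(self._split(path))
        if entry.is_dir:
            raise IsADirectoryError(path)
        return list(entry.replicas)

    def listdir(self, path):
        entry = self._walk(self._split(path))
        if not entry.is_dir:
            raise NotADirectoryError(path)
        return sorted(entry.children)


catalog = ToyCatalog()
catalog.mkdir("/grid")
catalog.mkdir("/grid/demo")
catalog.register("/grid/demo/run1.dat", "srm://site-a.example/run1.dat")
catalog.register("/grid/demo/run1.dat", "srm://site-b.example/run1.dat")
print(catalog.listdir("/grid/demo"))
print(catalog.list_replicas("/grid/demo/run1.dat"))
```

In this sketch a lookup follows the path components from the root, so listing a directory or finding the replicas of one file touches only the entries along that path. A production catalog would keep these entries in a database and add permissions, metadata and transactions, none of which is modelled here.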