Belle-DIRAC Setup for Using Amazon Elastic Compute Cloud

  • Authors:
  • Ricardo Graciani Diaz; Adria Casajus Ramo; Ana Carmona Agüero; Thomas Fifield; Martin Sevior

  • Affiliations:
  • University of Barcelona, Barcelona, Spain; University of Barcelona, Barcelona, Spain; University of Barcelona, Barcelona, Spain; University of Melbourne, Melbourne, Australia; University of Melbourne, Melbourne, Australia

  • Venue:
  • Journal of Grid Computing
  • Year:
  • 2011

Abstract

The Distributed Infrastructure with Remote Agent Control (DIRAC) software framework allows a user community to manage computing activities in a distributed environment. DIRAC has been developed within the Large Hadron Collider Beauty (LHCb) collaboration. After several years of successful use, it is the final solution adopted by the experiment. The Belle experiment at the Japanese High Energy Accelerator Research Organization (KEK) studies matter/anti-matter asymmetries using B mesons. During its lifetime, the Belle detector has collected about 5,000 terabytes of real and simulated data. Analyzing this data requires an enormous amount of computing-intensive Monte Carlo simulation. The Belle II experiment, which recently published its technical design report, will produce 50 times more data, so the collaboration is interested in determining whether commercial computing clouds can reduce the total cost of the experiment's computing solution. This paper describes the setup prepared to evaluate the performance and cost of this approach using real 2010 simulation tasks of the Belle experiment. The setup uses DIRAC as the overall management tool, controlling both the tasks to be executed and the deployment of virtual machines, with the Amazon Elastic Compute Cloud as the service provider. DIRAC is also used to monitor the execution, collect the necessary statistical data, and finally upload the results of the simulation to Belle resources on the Grid. The results of a first test using over 2,000 days of CPU time show that over 90% efficiency in the use of the resources can easily be achieved.