GridX1: A Canadian computational grid

  • Authors:
  • A. Agarwal;M. Ahmed;A. Berman;B. L. Caron;A. Charbonneau;D. Deatrich;R. Desmarais;A. Dimopoulos;I. Gable;L. S. Groer;R. Haria;R. Impey;L. Klektau;C. Lindsay;G. Mateescu;Q. Matthews;A. Norton;W. Podaima;D. Quesnel;R. Simmonds;R. J. Sobie;B. St. Arnaud;C. Usher;D. C. Vanderster;M. Vetterli;R. Walker;M. Yuen

  • Affiliations:
  • Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia, Canada;National Research Council, Ottawa, Ontario, Canada;Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia, Canada;TRIUMF, Vancouver, British Columbia, Canada and Department of Physics, University of Alberta, Edmonton, Alberta, Canada;National Research Council, Ottawa, Ontario, Canada;TRIUMF, Vancouver, British Columbia, Canada;Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia, Canada;Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia, Canada;Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia, Canada and HEPnet, Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia, ...;Department of Physics, University of Toronto, Toronto, Ontario, Canada;National Research Council, Ottawa, Ontario, Canada;National Research Council, Ottawa, Ontario, Canada;Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia, Canada;Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia, Canada;National Research Council, Ottawa, Ontario, Canada;Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia, Canada;Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia, Canada;National Research Council, Ottawa, Ontario, Canada;CANARIE Inc., Ottawa, Ontario, Canada;Department of Computer Science, University of Calgary, Calgary, Alberta, Canada;Institute of Particle Physics of Canada, Canada and Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia, Canada;CANARIE Inc., Ottawa, Ontario, Canada;Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia, Canada;Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia, Canada and Department of Electrical and Computer Engineering, University of Victoria, Victoria, British Col ...;TRIUMF, Vancouver, British Columbia, Canada and Department of Physics, Simon Fraser University, Burnaby, British Columbia, Canada;Department of Physics, Simon Fraser University, Burnaby, British Columbia, Canada;Department of Physics and Astronomy, University of Victoria, Victoria, British Columbia, Canada

  • Venue:
  • Future Generation Computer Systems
  • Year:
  • 2007

Abstract

This paper discusses the design and application of GridX1, a computational grid project that uses shared resources at several Canadian research institutions. The GridX1 infrastructure is built from off-the-shelf Globus Toolkit 2 middleware, a MyProxy credential server, and a Condor-G-based resource broker that manages the distributed computing environment. The broker's job scheduling and management functionality is exposed as a Globus GRAM job service. Resource brokering relies on the Condor matchmaking mechanism, in which job and resource attributes are expressed as ClassAds; the Requirements and Rank attributes define, respectively, the constraints and the preferences that the matched entity must meet. Various strategies for ranking resources are presented, including an Estimated-Waiting-Time (EWT) algorithm, a throttled load balancing strategy, and a novel external ranking strategy based on data location. A unique feature of GridX1 is a mechanism that transparently presents its resources as a single compute element to the LHC Computing Grid (LCG), based at the CERN Laboratory in Geneva. This interface was used during ATLAS Data Challenge 2 to federate the Canadian resources into the LCG without the overhead of maintaining separate LCG sites. In addition, the BaBar particle physics simulation has been adapted to run on GridX1, which simplified the management of production. Combining the throttled EWT and load balancing strategies with external data ranking was found to be very effective in improving efficiency and reducing the job failure rate.
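
The matchmaking idea described in the abstract can be illustrated with a minimal Python sketch. It is not the Condor ClassAd language and not the GridX1 broker: plain dictionaries stand in for ClassAds, and the names `Job`, `match`, `est_wait_s`, `queued_jobs`, and `max_queued` are hypothetical, chosen only for this illustration. The sketch evaluates only the job's side of the match, whereas Condor matchmaking also evaluates the resource's own Requirements. The estimated waiting time is taken as a given number, since the abstract does not say how it is computed.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A plain dict stands in for a resource ClassAd (assumption for illustration).
ResourceAd = dict


@dataclass
class Job:
    # Requirements: a hard constraint the resource ad must satisfy.
    requirements: Callable[[ResourceAd], bool]
    # Rank: a preference score over acceptable resources; higher is better.
    rank: Callable[[ResourceAd], float]


def match(job: Job, resources: list[ResourceAd],
          max_queued: Optional[int] = None) -> Optional[ResourceAd]:
    """Return the highest-ranked resource that meets the job's requirements.

    max_queued is a hypothetical throttle: resources that already hold that
    many queued jobs are skipped, a crude stand-in for throttled load balancing.
    """
    candidates = [r for r in resources if job.requirements(r)]
    if max_queued is not None:
        candidates = [r for r in candidates if r["queued_jobs"] < max_queued]
    if not candidates:
        return None
    return max(candidates, key=job.rank)


# Made-up resource ads; est_wait_s is an assumed estimated waiting time in seconds.
resources = [
    {"name": "site_a", "arch": "x86", "est_wait_s": 600, "queued_jobs": 4},
    {"name": "site_b", "arch": "x86", "est_wait_s": 120, "queued_jobs": 12},
    {"name": "site_c", "arch": "ia64", "est_wait_s": 10, "queued_jobs": 0},
]

# Require an x86 resource and prefer the shortest estimated wait.
job = Job(
    requirements=lambda r: r["arch"] == "x86",
    rank=lambda r: -r["est_wait_s"],
)

best = match(job, resources)
throttled = match(job, resources, max_queued=10)
```

Without the throttle, the job goes to the x86 site with the shortest estimated wait. With `max_queued=10`, that site is skipped because its queue is already too long, and the next-best acceptable site is chosen instead. A ranking based on data location could be expressed the same way, by having `rank` add a bonus for resources that hold the job's input data.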