The Second Trans-Pacific Grid Datafarm Testbed and Experiments for SC2003

  • Authors:
  • Osamu Tatebe;Hirotaka Ogawa;Yuetsu Kodama;Tomohiro Kudoh;Satoshi Sekiguchi;Satoshi Matsuoka;Kento Aida;Taisuke Boku;Mitsuhisa Sato;Youhei Morita;Yoshinori Kitatsuji;Jim Williams;John Hicks

  • Venue:
  • SAINT-W '04 Proceedings of the 2004 Symposium on Applications and the Internet-Workshops (SAINT 2004 Workshops)
  • Year:
  • 2004

Abstract

The Grid Datafarm architecture is designed for global petascale data-intensive computing. It provides a global parallel file system (the Gfarm file system) with online petascale storage, scalable I/O bandwidth, and scalable parallel processing by securely federating thousands of local file systems in a grid of clusters using Grid security infrastructure. One of its features is that it manages file replicas in the file system metadata for fault tolerance and load balancing. Here, we present an overview of our planned experiment to be performed as the SC2003 Bandwidth Challenge at the Supercomputing 2003 site in Phoenix, Arizona, USA. In the experiment, five clusters in Japan and three clusters in the US will comprise a Gfarm file system, on which worldwide large-scale data analysis will be performed. In the Gfarm file system, a file is dispersed as fragments across several cluster nodes, and each fragment is replicated independently and in parallel by multiple third-party transfers between multiple cluster nodes. For the Challenge, terabyte-scale experimental data will be replicated between the US and Japan via APAN/TransPAC and SuperSINET (about 10,000 km or 6,000 miles). At the workshop we expect to present the full details of the experiment.
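
The parallel replication described in the abstract can be illustrated with a minimal sketch. It is not the Gfarm implementation or its API: the node names, the in-memory `nodes` dictionary, and the functions `replicate_fragment` and `replicate_file` are all hypothetical, and a plain dictionary copy stands in for a real third-party transfer between cluster nodes. The only point shown is that each piece of a file is copied independently, so the copies can run at the same time.

```python
from concurrent.futures import ThreadPoolExecutor


def replicate_fragment(nodes, frag_id, src, dst):
    """Copy one fragment from node `src` to node `dst`.

    Hypothetical stand-in for a third-party transfer; here it is
    just a dictionary assignment between two in-memory "nodes".
    """
    nodes[dst][frag_id] = nodes[src][frag_id]
    return frag_id, src, dst


def replicate_file(nodes, placement, targets):
    """Replicate every fragment of a file independently and in parallel.

    placement: fragment id -> node that currently holds the fragment
    targets:   fragment id -> node that should receive the replica
    """
    workers = max(1, len(placement))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(replicate_fragment, nodes, f, placement[f], targets[f])
            for f in placement
        ]
        # Collect results in a stable order for reporting.
        return sorted(fut.result() for fut in futures)


# Assumed toy layout: two fragments in Japan, replicated to two US nodes.
nodes = {
    "jp-node0": {0: b"frag-0"},
    "jp-node1": {1: b"frag-1"},
    "us-node0": {},
    "us-node1": {},
}
placement = {0: "jp-node0", 1: "jp-node1"}
targets = {0: "us-node0", 1: "us-node1"}

transfers = replicate_file(nodes, placement, targets)
for frag_id, src, dst in transfers:
    print(f"fragment {frag_id}: {src} -> {dst}")
```

Because each transfer touches a different source and destination node, adding fragments and nodes adds transfers that do not contend with one another in this toy model, which is the intuition behind the scalable bandwidth claim.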