Sustainable adaptive grid supercomputing: multiscale simulation of semiconductor processing across the Pacific

  • Authors:
  • Hiroshi Takemiya;Yoshio Tanaka;Satoshi Sekiguchi;Shuji Ogata;Rajiv K. Kalia;Aiichiro Nakano;Priya Vashishta

  • Affiliations:
  • National Institute of Advanced Industrial Science and Technology, Japan;National Institute of Advanced Industrial Science and Technology, Japan;National Institute of Advanced Industrial Science and Technology, Japan;Nagoya Institute of Technology, Japan;University of Southern California;University of Southern California;University of Southern California

  • Venue:
  • Proceedings of the 2006 ACM/IEEE conference on Supercomputing
  • Year:
  • 2006

Abstract

We propose a reservation-based sustainable adaptive Grid supercomputing paradigm to enable tightly coupled computations of considerable scale (involving over 1,000 processors) and duration (over tens of continuous days) on a Grid of geographically distributed parallel supercomputers. The paradigm is demonstrated for an adaptive multiscale simulation application, in which accurate but compute-intensive quantum mechanical (QM) simulations are embedded within a classical molecular dynamics (MD) simulation only when and where high fidelity is required. Key technical innovations include: 1) an embedded divide-and-conquer algorithmic framework to maximally expose data and computation localities for enhanced scalability; 2) a buffered-cluster hybridization scheme to adaptively adjust MD/QM boundaries to maintain model accuracy; and 3) a hybrid Grid remote procedure call (GridRPC) + message passing interface (MPI) Grid application framework to combine flexibility (adaptive resource allocation and migration), fault tolerance (automated fault recovery), and efficiency (scalable management of large computing resources). We have achieved automated execution of a multiscale MD/QM simulation on a Grid consisting of 6 supercomputer centers in Japan and the US (a total of 150 thousand processor-hours) for the dynamic simulation of implanted oxygen atoms in a silicon substrate, in which the number of processors changes dynamically on demand and resources are allocated and migrated dynamically in response to both reservations and unexpected faults. The simulation results reveal a strong dependence of the oxygen penetration depth on the incident oxygen-beam position, which is useful information for further advancing the SIMOX (separation by implanted oxygen) technique for fabricating high-speed, low-power-consumption semiconductor devices.
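The resource migration described in the abstract (moving work to another site when a reservation ends or a fault occurs) can be illustrated with a minimal sketch. This is not the authors' GridRPC + MPI framework: the `Cluster` class, the `ClusterFault` exception, the `dispatch` function, and the site names are all hypothetical, and a local method call stands in for a remote procedure call.

```python
from dataclasses import dataclass


class ClusterFault(Exception):
    """Raised by a (hypothetical) cluster when a call to it fails."""


@dataclass
class Cluster:
    name: str
    reserved: bool = True   # assumption: True while a reservation is active
    healthy: bool = True    # set to False to simulate an unexpected fault

    def run_qm(self, region_id: int) -> str:
        # Stand-in for a remote QM calculation on this cluster.
        if not self.healthy:
            raise ClusterFault(f"{self.name} is down")
        return f"QM region {region_id} computed on {self.name}"


def dispatch(region_id: int, clusters: list) -> str:
    """Try each reserved cluster in order; move on to the next after a fault."""
    for cluster in clusters:
        if not cluster.reserved:
            continue  # no active reservation on this site, skip it
        try:
            return cluster.run_qm(region_id)
        except ClusterFault:
            continue  # fault detected: migrate the task to the next site
    raise RuntimeError(f"no cluster available for QM region {region_id}")


if __name__ == "__main__":
    sites = [
        Cluster("site-a", healthy=False),   # faulty site
        Cluster("site-b", reserved=False),  # reservation expired
        Cluster("site-c"),
    ]
    for region in range(2):
        print(dispatch(region, sites))
```

In this sketch each QM region is an independent task, so a failed or unreserved site only delays that task rather than stopping the whole run; a real deployment would also need to restart the MPI job inside the new site.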