Software system of the Earth Simulator

  • Authors:
  • Takashi Yanagawa;Kenji Suehiro

  • Affiliations:
  • 1st Computers Software Division, NEC Corporation, 1-10, Nisshin-cho, Fuchu, Tokyo 183-8501, Japan;1st Computers Software Division, NEC Corporation, 1-10, Nisshin-cho, Fuchu, Tokyo 183-8501, Japan

  • Venue:
  • Parallel Computing
  • Year:
  • 2004

Abstract

The Earth Simulator (ES) is a large-scale, distributed-memory parallel computer system consisting of 640 processor nodes (PNs), each a shared-memory vector multiprocessor (64 GFLOPS/PN; 5120 APs in total, where AP stands for arithmetic processor). All the nodes are connected via a high-speed (16 GB/s) single-stage crossbar network called the Interconnection Network (IN).

The operating system for the Earth Simulator is based on SUPER-UX, the UNIX operating system for the SX series of scientific supercomputers. To realize high-performance parallel processing on this highly parallel machine, the operating system has been enhanced for scalability.

The Earth Simulator system is managed as a two-level cluster system called the Super Cluster System. In the Super Cluster System, the Earth Simulator is divided into 40 clusters (16 PNs/cluster). A single controller called the Super Cluster Control Station (SCCS) manages all of these clusters. This management system provides Single System Image (SSI) operation, management, and job control for the large-scale multi-node system.

The Job Scheduler (JS) and NQS, running on the SCCS, control all jobs on the system. They schedule resources, such as processor nodes and files, that have not usually been treated as schedulable resources. This allows efficient scheduling of large-scale jobs.

The MPI library (MPI/ES) and the HPF compiler (HPF/ES) are available for distributed parallel programming on the Earth Simulator. MPI/ES conforms to the MPI 2.0 standard and is optimized to exploit the hardware features. HPF/ES conforms to the core part of HPF 2.0 and supports some features of the HPF 2.0 approved extensions and the HPF/JA 1.0 extensions. HPF/ES handles the three levels of parallelism of the Earth Simulator system: vectorization, shared-memory parallelization, and distributed-memory parallelization. Moreover, HPF/ES extends the language to make irregular problems easier to handle.
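The figures quoted in the abstract are mutually consistent, and a few derived quantities follow from them by simple arithmetic. The Python sketch below is only an illustration: the class name `EarthSimulatorConfig` and its methods are hypothetical, and the `cluster_of` helper assumes that processor nodes are numbered contiguously from 0 and assigned to clusters in order, which the abstract does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EarthSimulatorConfig:
    """Figures taken from the abstract; the class itself is a hypothetical helper."""
    processor_nodes: int = 640      # PNs in the whole system
    gflops_per_node: int = 64       # peak GFLOPS per PN
    total_aps: int = 5120           # arithmetic processors in total
    clusters: int = 40              # clusters in the Super Cluster System
    nodes_per_cluster: int = 16     # PNs per cluster

    def aps_per_node(self) -> int:
        # Derived: 5120 APs spread over 640 PNs.
        return self.total_aps // self.processor_nodes

    def gflops_per_ap(self) -> float:
        # Derived: per-node peak divided by the APs in a node.
        return self.gflops_per_node / self.aps_per_node()

    def peak_tflops(self) -> float:
        # Derived: aggregate peak over all PNs, in TFLOPS.
        return self.processor_nodes * self.gflops_per_node / 1000

    def cluster_of(self, node: int) -> int:
        # ASSUMPTION: nodes are numbered 0..639 and grouped contiguously,
        # 16 per cluster. The real numbering scheme is not given in the abstract.
        if not 0 <= node < self.processor_nodes:
            raise ValueError(f"node {node} out of range")
        return node // self.nodes_per_cluster


ES = EarthSimulatorConfig()

if __name__ == "__main__":
    print("APs per PN:   ", ES.aps_per_node())
    print("GFLOPS per AP:", ES.gflops_per_ap())
    print("Peak TFLOPS:  ", ES.peak_tflops())
    print("Cluster of PN 100 (assumed numbering):", ES.cluster_of(100))
```

Under these figures each PN holds 8 APs at 8 GFLOPS apiece, and the 40 clusters of 16 PNs account for all 640 nodes.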