Parallel I/O Subsystems in Massively Parallel Supercomputers

  • Authors:
  • Dror G. Feitelson, Peter F. Corbett, Sandra Johnson Baylor, Yarsun Hsu

  • Venue:
  • IEEE Parallel & Distributed Technology: Systems & Applications
  • Year:
  • 1995

Abstract

Applications on MPPs often require a high aggregate bandwidth of low-latency I/O to secondary storage. This requirement can be met by internal parallel I/O subsystems that comprise dedicated I/O nodes, each with its own processor, memory, and disks.

Massively parallel processors (MPPs), encompassing tens to thousands of processors, are emerging as a major architecture for high-performance computers. Most major computer vendors offer computers with some degree of parallelism, and many smaller vendors specialize in producing MPPs. These machines are targeted at both grand-challenge problems and general-purpose computing.

Like any computer, an MPP must balance computation, memory bandwidth and capacity, communication capabilities, and I/O in its architectural design. In the past, most design research focused on the basic compute and communications hardware and software. This led to unbalanced computers with relatively poor I/O performance. Recently, researchers have focused on designing hardware and software for I/O subsystems in MPPs. Consequently, most current MPPs have an architecture based on an internal parallel I/O subsystem (the "Architectures with parallel I/O" sidebar describes some examples). In these computers, the subsystem encompasses a collection of I/O nodes, each managing and providing I/O access to a set of disks. The I/O nodes connect to the other nodes in the system by the same switching network that connects the compute nodes.

In this article we'll examine why many MPPs use parallel I/O subsystems, what architecture is best for such a subsystem, and how to implement the subsystem. We'll also discuss how parallel file systems and their user interfaces can exploit the parallel I/O to provide enhanced services to applications.

The systems discussed in this article are mostly tightly coupled distributed-memory MIMD (multiple-instruction, multiple-data) MPPs. In some cases, we also discuss shared-memory and SIMD (single-instruction, multiple-data) machines. We'll discuss three node types. Compute nodes are optimized to perform floating-point and numeric calculations, and have no local disk except perhaps for paging, booting, and operating-system software. I/O nodes contain the system's secondary storage and provide the parallel file-system services. Gateway nodes provide connectivity to external data servers and mass-storage systems. In some cases, individual nodes can serve as more than one type; for example, the same nodes often handle both I/O and gateway functions. The "Terminology" sidebar defines some other terms used in this article.
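
To make the idea of declustering a file over I/O nodes concrete, the following is a minimal C sketch of a round-robin block placement, under which any compute node can determine where a block lives without central coordination. The node count, block size, and function names are illustrative assumptions for this sketch, not the interface or layout of any particular machine described in the article.

    #include <stdio.h>

    /* Hypothetical round-robin declustering of a file across I/O nodes.
     * NUM_IO_NODES and BLOCK_SIZE are illustrative assumptions only. */
    #define NUM_IO_NODES 8
    #define BLOCK_SIZE   4096   /* bytes per stripe unit */

    /* Map a byte offset in the file to the I/O node that holds it. */
    static int io_node_for_offset(long offset)
    {
        long block = offset / BLOCK_SIZE;
        return (int)(block % NUM_IO_NODES);
    }

    /* Offset of the same byte within that I/O node's local portion of the file. */
    static long local_offset(long offset)
    {
        long block  = offset / BLOCK_SIZE;
        long within = offset % BLOCK_SIZE;
        return (block / NUM_IO_NODES) * BLOCK_SIZE + within;
    }

    int main(void)
    {
        /* A compute node reading sequentially contacts I/O nodes in turn,
         * so a large request is served by several disks in parallel. */
        for (long off = 0; off < 5L * BLOCK_SIZE; off += BLOCK_SIZE)
            printf("file offset %ld -> I/O node %d, local offset %ld\n",
                   off, io_node_for_offset(off), local_offset(off));
        return 0;
    }

With such a layout, consecutive blocks of a file map to distinct I/O nodes, which is one simple way a parallel file system can turn a single large request from a compute node into concurrent transfers from many disks.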