On input/output speedup in tightly coupled multiprocessors

Authors:
Walid Abu-Sufah;Harlan E. Husmann;David J. Kuck
Affiliations:
Univ. of Illinois, Urbana;Univ. of Illinois, Urbana;Univ. of Illinois, Urbana
Venue:
IEEE Transactions on Computers - The MIT Press scientific computation series
Year:
1986

Citing 21
Cited 1

A technique for reducing synchronization overhead in large scale multiprocessors

ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
Performance prediction tools for Cedar: a multiprocessor supercomputer

ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
Practical Parallel Band Triangular System Solvers

ACM Transactions on Mathematical Software (TOMS)
A Survey of Parallel Machine Organization and Programming

ACM Computing Surveys (CSUR)
Structure of Computers and Computations

Structure of Computers and Computations
Automatic program restructuring for high-speed computation

CONPAR '81 Proceedings of the Conference on Analysing Problem Classes and Programming for Parallel Computing
Optimizing supercompilers for supercomputers

Optimizing supercompilers for supercomputers
Compile-time scheduling and optimization for asynchronous machines (multiprocessor, compiler, parallel processing)

Compile-time scheduling and optimization for asynchronous machines (multiprocessor, compiler, parallel processing)
Compiler optimizations and architecture design issues for multiprocessors (parallel)

Compiler optimizations and architecture design issues for multiprocessors (parallel)
A VLSI-Based I/O Formatting Device

IEEE Transactions on Computers
The NYU Ultracomputer Designing an MIMD Shared Memory Parallel Computer

IEEE Transactions on Computers
A Massive Memory Machine

IEEE Transactions on Computers
High-Speed Multiprocessors and Compilation Techniques

IEEE Transactions on Computers
Parallelism and Representation Problems in Distributed Systems

IEEE Transactions on Computers
A Survey of Highly Parallel Computing

Computer
Why Systolic Architectures?

Computer
Introduction to the Configurable, Highly Parallel Computer

Computer
Special Feature Program Measurements on a High-Level Language Computer

Computer
Data Flow Supercomputers

Computer
STARAN parallel processor system hardware

AFIPS '74 Proceedings of the May 6-10, 1974, national computer conference and exposition
An overview of the Texas reconfigurable array computer

AFIPS '80 Proceedings of the May 19-22, 1980, national computer conference

A study of I/O behavior of perfect benchmarks on a multiprocessor

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture

Abstract

Previous models of program speedup on parallel architectures tend to ignore I/O activity and other important issues. In this paper we derive analytic speedup models that include I/O activity. We show that ignoring I/O yields conservative speedup estimates. We explore the effectiveness of using hardware format conversion units in multiprocessors [33]. We prove that hardware parallel format conversion loses its edge over software parallel format conversion as the ratio of the number of processors to I/O bandwidth increases. For a given number of processors, program speedup is more sensitive to the available I/O bandwidth than to the format conversion speed. Ninety-one Fortran programs are used in various experiments to verify our models and conclusions. Most of the programs are I/O bound. Our empirical results show that including I/O activity improves the speedup factor for 78 percent of the programs, and that 18 percent of the programs are sped up only because of faster I/O. For a serial machine, using the hardware format conversion units designed in [13] reduces program execution time by an average factor of three. The software format conversion speeds used were obtained from direct measurements on an IBM 4341 running CMS and a CDC Cyber 175 running NOS. For multiprocessor systems, a factor-of-eight increase in the processor-to-I/O-bandwidth ratio reduces the benefit of hardware format conversion to an average factor of 1.36.
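
The claim that ignoring I/O gives conservative speedup figures can be illustrated with a toy calculation. The sketch below is not the paper's analytic model, which the abstract does not spell out. It assumes only that a program's serial time splits into a compute part and an I/O part, and that each part is sped up by its own factor. The function name `overall_speedup` and all numbers are hypothetical.

```python
def overall_speedup(t_compute, t_io, s_compute, s_io):
    """Toy model (an assumption, not the paper's): compute and I/O phases
    do not overlap and are each sped up by a separate factor."""
    serial_time = t_compute + t_io
    parallel_time = t_compute / s_compute + t_io / s_io
    return serial_time / parallel_time


# Hypothetical workload: 10 s of computation, 30 s of I/O (I/O bound).
# Computation is sped up 4x; I/O (transfer plus format conversion) 8x.
compute_only = overall_speedup(10.0, 0.0, s_compute=4.0, s_io=1.0)
with_fast_io = overall_speedup(10.0, 30.0, s_compute=4.0, s_io=8.0)
with_slow_io = overall_speedup(10.0, 30.0, s_compute=4.0, s_io=2.0)

print(compute_only, with_fast_io, with_slow_io)
```

Under these assumptions, the compute-only figure understates the overall speedup whenever I/O is sped up by more than computation, and overstates it otherwise. This is why the available I/O bandwidth matters so much for an I/O-bound program.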