On characterizing bandwidth requirements of parallel applications

Authors:
Anand Sivasubramaniam;Aman Singla;Umakishore Ramachandran;H. Venkateswaran
Affiliations:
College of Computing, Georgia Institute of Technology, Atlanta, GA;College of Computing, Georgia Institute of Technology, Atlanta, GA;College of Computing, Georgia Institute of Technology, Atlanta, GA;College of Computing, Georgia Institute of Technology, Atlanta, GA
Venue:
Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Year:
1995

Abstract

Synthesizing architectural requirements from an application viewpoint can inform important design decisions in building large-scale parallel machines. In this paper, we quantify the link bandwidth requirement on a binary hypercube topology for a set of five parallel applications. We use an execution-driven simulator called SPASM to collect data points for system sizes that are feasible to simulate. These data points are then used in a regression analysis to project the link bandwidth requirements for larger systems. The requirements are projected as a function of the following system parameters: number of processors, CPU clock speed, and problem size. These results are also used to project the link bandwidths for other network topologies. Our study quantifies the link bandwidth that must be made available to limit the network overhead in an application to a specified tolerance level. The results show that typical link bandwidths (200-300 MBytes/sec) found in current commercial parallel architectures (such as the Intel Paragon and Cray T3D) would incur fairly low network overhead for the applications considered in this study. For two of the applications, this overhead is negligible. For the other applications, this overhead can be limited to about 30% of the execution time, provided the problem sizes are increased commensurate with the processor clock speed. The technique presented can help a system architect synthesize the bandwidth requirements for realizing well-balanced parallel architectures.
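
The projection step described in the abstract (fit a regression to simulated data points, then extrapolate to larger systems) can be illustrated with a minimal sketch. The abstract does not state the regression model, so the power-law form below is an assumption chosen for illustration, and the data points are synthetic placeholders generated from a known curve, not measurements from SPASM or results from the paper. The names `fit_power_law` and `project` are hypothetical.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = c * x**k, done as a straight line in log-log space.

    ASSUMPTION: a power law is used here only for illustration; the paper's
    actual regression model is not given in the abstract.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    k = sxy / sxx                         # slope in log-log space = exponent
    c = math.exp(mean_y - k * mean_x)     # intercept in log-log space = log(c)
    return c, k


def project(c, k, x):
    """Evaluate the fitted curve at a system size outside the simulated range."""
    return c * x ** k


# SYNTHETIC placeholder data (not from the paper): bandwidth values generated
# from 2.0 * p**0.5 for processor counts small enough to "simulate".
sim_processors = [4, 8, 16, 32, 64]
sim_bandwidth = [2.0 * p ** 0.5 for p in sim_processors]

c, k = fit_power_law(sim_processors, sim_bandwidth)
projected_1024 = project(c, k, 1024)      # extrapolate to a larger system
print(f"c={c:.3f}, k={k:.3f}, projected bandwidth at p=1024: {projected_1024:.1f}")
```

Because the placeholder data lie exactly on a power law, the fit recovers the generating parameters; with real simulation output the fit would be approximate, and the same procedure would be repeated for the other parameters the abstract names (CPU clock speed and problem size).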