Parallel applications and architectures utilizing bi-directional rhombic interconnection networks with circular and segmented buses
This paper extends research on rhombic overlapping-connectivity interconnection networks into the area of parallel applications. As the foundation for a shared-memory, non-uniform-access, bus-based multiprocessor, these interconnection networks create overlapping groups of processors, buses, and memories, forming a clustered computer architecture in which the clusters overlap. This overlapping-membership characteristic is shown to be useful for matching a parallel application's communication topology to the architecture's bandwidth characteristics. Many parallel applications can be mapped onto the architecture topology so that most or all communication is localized within an overlapping cluster, at the low latency of direct processor-to-cache (or processor-to-memory) access over a bus. Consequently, the latency of communication between parallel threads neither degrades parallel performance nor limits the granularity of applications. Parallel applications can execute with good speedup and scaling on a proposed architecture that is designed to obtain maximum advantage from the overlapping-cluster characteristic and that also allows dynamic workload migration without moving instructions or data. The scalability limitations of bus-based shared-memory multiprocessors are overcome by judicious workload-allocation schemes that take advantage of the overlapping cluster memberships. Bus-based rhombic shared-memory multiprocessors are examined in terms of parallel speedup models to explain their advantages and to justify their use as the foundation for the proposed computer architecture. Interconnection bandwidth is maximized with bi-directional circular and segmented overlapping buses. Strategies for mapping parallel application communication topologies onto rhombic architectures are developed. Analytical models of enhanced rhombic multiprocessor performance are developed with a unique bandwidth-modeling technique and compared with the results of simulation.
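The overlapping-cluster idea and its effect on speedup can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the paper's actual architecture or analytical model: clusters are taken to be fixed-size groups of processors placed at a regular stride around a ring, so that adjacent clusters share members, and the latency and workload parameters are invented for demonstration. It shows how a nearest-neighbor application mapping keeps all communication inside a shared cluster (low latency), and how a simple latency-aware speedup model then favors localized communication.

```python
# Illustrative sketch only: cluster layout, latencies, and the speedup
# formula below are hypothetical, chosen to demonstrate the idea of
# overlapping-cluster locality rather than the paper's real model.

def clusters_of(p, n_procs, cluster_size, stride):
    """Indices of the overlapping clusters containing processor p.

    Cluster c is assumed to span processors
    c*stride .. c*stride + cluster_size - 1 (mod n_procs),
    so with stride < cluster_size, neighboring clusters overlap.
    """
    n_clusters = n_procs // stride
    return [c for c in range(n_clusters)
            if (p - c * stride) % n_procs < cluster_size]

def shares_cluster(p, q, n_procs, cluster_size, stride):
    """True if some cluster contains both p and q (communication is local)."""
    a = set(clusters_of(p, n_procs, cluster_size, stride))
    b = set(clusters_of(q, n_procs, cluster_size, stride))
    return bool(a & b)

def speedup(n_procs, work, comm_msgs, f_local, t_local=1.0, t_remote=20.0):
    """Toy speedup model: per-processor compute time plus communication
    time, where a fraction f_local of messages stay within an overlapping
    cluster (cheap bus access) and the rest pay a remote penalty."""
    t_parallel = (work / n_procs
                  + comm_msgs * (f_local * t_local + (1 - f_local) * t_remote))
    return work / t_parallel

# Nearest neighbors on the ring always share a cluster here, so a
# nearest-neighbor application can run with f_local = 1.
print(shares_cluster(0, 1, 16, 4, 2))   # neighbors: local
print(shares_cluster(0, 8, 16, 4, 2))   # distant pair: not local
print(speedup(16, 16000, 100, f_local=1.0))
print(speedup(16, 16000, 100, f_local=0.0))
```

With these toy numbers, the fully localized mapping achieves markedly higher speedup than one whose communication all crosses cluster boundaries, which is the qualitative point the abstract makes about matching application topology to the overlapping-cluster structure.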