FLUX interconnection networks on demand

Authors:
Stamatis Vassiliadis;Ioannis Sourdis
Affiliations:
Computer Engineering, TU Delft, The Netherlands;Computer Engineering, TU Delft, The Netherlands
Venue:
Journal of Systems Architecture: the EUROMICRO Journal
Year:
2007

Abstract

In this paper, we introduce FLUX interconnection networks, a scheme in which the interconnections of a parallel system are established on demand, before or during program execution. We present a programming paradigm that can be used to make the proposed solution feasible. We perform several experiments to show the viability of our approach and the potential performance gain from using the most suitable network configuration for a given parallel program. In several case studies, we evaluate different algorithms developed for meshes or trees and map them onto "grid"-like or reconfigurable physical interconnection networks. Our results show that, depending on the underlying network, different mappings suit different algorithms. Even for a single algorithm, the most appropriate mapping changes with the size of the data being processed, the number of nodes used, or the hardware cost of the processing elements. This implies that changing interconnection topologies or mappings dynamically, on demand, according to program needs can be beneficial.
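To make the idea of mapping a tree algorithm onto different physical networks concrete, the following is a minimal sketch and not the authors' method or cost model. It assumes a tree reduction over a complete binary tree (each child sends one message to its parent), counts hops as the only cost, and compares a network configured as the tree itself against a fixed 2D mesh with a naive row-major placement. The function names, the placement, and the hop-count metric are all hypothetical choices made for illustration.

```python
def tree_edges(num_nodes):
    """Parent-child edges of a complete binary tree with heap-style ids 1..num_nodes."""
    return [(child // 2, child) for child in range(2, num_nodes + 1)]


def row_major_position(node_id, width):
    """Place node ids on a mesh row by row (an assumed, deliberately naive mapping)."""
    return divmod(node_id - 1, width)  # (row, column)


def mesh_hops(a, b, width):
    """Manhattan distance between two placed nodes, i.e. hops under row-column routing."""
    row_a, col_a = row_major_position(a, width)
    row_b, col_b = row_major_position(b, width)
    return abs(row_a - row_b) + abs(col_a - col_b)


def reduction_cost(num_nodes, mesh_width=None):
    """Return (total hops, worst-case hops) for one child-to-parent message per tree edge.

    mesh_width=None models a network configured as the tree itself (assumption:
    every logical edge is a direct link, so it costs exactly one hop).
    """
    edges = tree_edges(num_nodes)
    if mesh_width is None:
        hops = [1 for _ in edges]
    else:
        hops = [mesh_hops(parent, child, mesh_width) for parent, child in edges]
    return sum(hops), max(hops)


if __name__ == "__main__":
    nodes = 15  # complete binary tree with 4 levels
    print("tree-configured network:", reduction_cost(nodes))
    print("4x4 mesh, row-major map:", reduction_cost(nodes, mesh_width=4))
```

Under these assumptions, a 15-node tree needs 14 hops in total (worst case 1) on the tree-configured network, and 32 hops (worst case 5) on the 4x4 mesh with row-major placement. A better embedding would narrow that gap, which is the kind of trade-off between mappings and underlying networks the abstract refers to.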