In this paper, we introduce FLUX interconnection networks, a scheme in which the interconnections of a parallel system are established on demand, before or during program execution. We present a programming paradigm that makes the proposed solution feasible. We perform several experiments to show the viability of our approach and the potential performance gain from using the most suitable network configuration for a given parallel program. In several case studies, we evaluate different algorithms developed for meshes or trees and map them onto "grid"-like or reconfigurable physical interconnection networks. Our results show that, depending on the underlying network, different mappings suit different algorithms. Even for a single algorithm, the most appropriate mapping changes with the size of the data being processed, the number of nodes used, or the hardware cost of the processing elements. This implies that changing interconnection topologies or mappings dynamically, on demand and according to program needs, can be beneficial.
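As a rough illustration of why the mapping matters, the following minimal sketch compares the total hop count of a single binary-tree reduction step (each child sends one message to its parent) under two assumed setups: the tree embedded in a 2D mesh, and a network configured so that every tree edge is a direct link. The row-major embedding, the hop-count cost model, and all function names are assumptions made for this example; they are not taken from the paper, and the sketch ignores factors the abstract mentions, such as data size and the hardware cost of processing elements.

```python
def tree_edges(n):
    """(parent, child) edges of a heap-indexed binary tree with nodes 1..n."""
    return [(child // 2, child) for child in range(2, n + 1)]


def mesh_coord(node, width):
    """Assumed row-major placement of tree node `node` on a mesh `width` columns wide."""
    return divmod(node - 1, width)


def mesh_hops(a, b, width):
    """Manhattan distance between two tree nodes under the row-major placement."""
    (r1, c1), (r2, c2) = mesh_coord(a, width), mesh_coord(b, width)
    return abs(r1 - r2) + abs(c1 - c2)


def reduction_cost_on_mesh(n, width):
    """Total hops when every child sends one message to its parent over the mesh."""
    return sum(mesh_hops(p, c, width) for p, c in tree_edges(n))


def reduction_cost_on_tree(n):
    """Total hops when every tree edge is a direct physical link (one hop each)."""
    return n - 1


if __name__ == "__main__":
    n, width = 15, 4  # 15-node tree on a 4-column mesh (hypothetical sizes)
    print("mesh hops:", reduction_cost_on_mesh(n, width))
    print("tree hops:", reduction_cost_on_tree(n))
```

Under this simplified model the tree-configured network never needs more hops than the mesh embedding, since every tree edge costs at least one hop on the mesh. An algorithm written for a mesh would show the opposite preference, which is the kind of trade-off the experiments above explore.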