Dynamic SMP Clusters with Communication on the Fly in NoC Technology for Very Fine Grain Computations

Authors:
Marek Tudruj; Lukasz Masko
Affiliations:
Polish Academy of Sciences and Polish-Japanese Institute of Information Technology; Polish Academy of Sciences
Venue:
ISPDC '04 Proceedings of the Third International Symposium on Parallel and Distributed Computing/Third International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Networks
Year:
2004

Citing 0
Cited 7

Multi-CMP system with data communication on the fly
The Journal of Supercomputing

Dynamic SMP clusters in SoC technology – towards massively parallel fine grain numerics
PPAM'05 Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics

Scheduling moldable tasks for dynamic SMP clusters in SoC technology
PPAM'05 Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics

Scheduling architecture-supported regions in parallel programs
PARA'10 Proceedings of the 10th international conference on Applied Parallel and Scientific Computing - Volume Part I

Data transfers on the fly for hierarchical systems of chip multi-processors
PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part I

Scheduling parallel programs based on architecture-supported regions
PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part II

Parallel matrix multiplication based on dynamic SMP clusters in SoC technology
ISPA'07 Proceedings of the 2007 international conference on Frontiers of High Performance Computing and Networking

Abstract

The paper presents a new architecture for systems based on run-time reconfigured shared-memory processor clusters, intended for implementation in network-on-chip (NoC) technology. Clusters form local data exchange sub-networks that dynamically connect processors with shared memory modules. The sub-networks allow data in one processor's data cache to be exposed so that other processors can read it into their own data caches. This inter-processor data exchange paradigm, called "communication on the fly", enables direct communication between processor data caches. Dual-ported data caches are assumed, so that data can be read and written in parallel between the caches and memory modules. In the proposed architecture, programs are executed according to a cache-controlled macro data flow execution model. Computational tasks are defined so as to eliminate reloading of data caches during task execution. A special macro data flow graph representation of programs enables modeling of program behaviour under different architectural and program structure assumptions. The paper presents simulation results from symbolic execution of program graphs for matrix multiplication. They show that the proposed architecture is suitable for very fine grain parallel computations.
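
The idea of defining tasks so that a data cache never has to be reloaded during a task can be illustrated with a small sketch for matrix multiplication. This is not the authors' task model or simulator. The function names `block_tasks` and `run_tasks`, the cache capacity measured in words, and the working-set estimate of three b x b blocks per task are all assumptions made here for illustration only.

```python
def block_tasks(n, cache_words):
    """Split C = A * B (n x n) into block tasks whose working set fits in one cache.

    Assumption (not from the paper): a task touches one b x b block each of
    A, B and C, so its working set is 3 * b * b words.
    """
    b = n
    # Shrink the block edge until the working set fits and b divides n evenly.
    while b > 1 and (3 * b * b > cache_words or n % b != 0):
        b -= 1
    # One task per (row block, column block, inner block) triple.
    tasks = [(i, j, k)
             for i in range(0, n, b)
             for j in range(0, n, b)
             for k in range(0, n, b)]
    return b, tasks


def run_tasks(A, B, n, b, tasks):
    """Execute the block tasks sequentially and accumulate the product."""
    C = [[0] * n for _ in range(n)]
    for i0, j0, k0 in tasks:
        # Everything this task reads or writes lies inside three b x b blocks.
        for i in range(i0, i0 + b):
            for j in range(j0, j0 + b):
                partial = 0
                for k in range(k0, k0 + b):
                    partial += A[i][k] * B[k][j]
                C[i][j] += partial
    return C
```

With a hypothetical cache of 12 words and 4 x 4 matrices, `block_tasks(4, 12)` chooses 2 x 2 blocks and produces eight tasks. In this sketch the tasks run one after another. In a macro data flow setting, the tasks that share the same (i, j) output block would be the ones linked by data dependencies.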