Dynamic SMP Clusters with Communication on the Fly in SoC Technology Applied for Medium-Grain Parallel Matrix Multiplication

  • Authors:
  • M. Tudruj;L. Masko

  • Affiliations:
  • Polish-Japanese Institute of Information Technology, Warsaw, Poland;Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland

  • Venue:
  • PDP '07 Proceedings of the 15th Euromicro International Conference on Parallel, Distributed and Network-Based Processing
  • Year:
  • 2007

Abstract

The paper presents a study of medium- and coarse-grain numerical computations in a new cluster-based shared memory parallel architecture intended for implementation in "Systems on Chip" (SoC) technology. The assumed architecture is based on dynamic processor clusters organized around shared memory modules. Fast shared data transfers between processors from different clusters are performed through communication on the fly, which combines processor switching between clusters with intracluster data reads on the fly. Dynamic processor clusters are implemented inside SoC modules, which are additionally connected by a global inter-cluster network. The paper discusses the speedup and parallelization efficiency of parallel matrix multiplication, estimated by symbolic execution of program graphs. Simulation results are presented for algorithms with two kinds of data decomposition: recursive division of matrices into quadrants and division of matrices into stripes. In the quadrant-based method, elementary square sub-matrix multiplications are performed using the serial Strassen method, and communication on the fly is applied. The experiments reveal much higher efficiency for the proposed quadrant-based matrix multiplication method than for the "stripe" method, which is considered very efficient in conventional parallel shared memory systems.
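
The two data decompositions named in the abstract can be illustrated with a minimal sequential sketch. It is not the authors' implementation: it does not model dynamic clusters, communication on the fly, or program graphs, and every function name, the `cutoff` and `depth` parameters, and the choice of row stripes of the left operand are assumptions made only for illustration. Each sub-product below stands for work that a parallel version could assign to a separate processor.

```python
# Sketch only: sequential illustration of quadrant and stripe decomposition
# for square matrices stored as lists of rows. All names are hypothetical.

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def naive_mul(a, b):
    # Classical row-by-column product, used as the base case and reference.
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def split(m):
    # Divide a matrix with an even side length into four quadrants.
    h = len(m) // 2
    return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
            [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])

def join(c11, c12, c21, c22):
    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom

def strassen(a, b, cutoff=2):
    # Serial Strassen multiplication: 7 recursive products instead of 8.
    n = len(a)
    if n <= cutoff or n % 2:
        return naive_mul(a, b)
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)
    m1 = strassen(mat_add(a11, a22), mat_add(b11, b22), cutoff)
    m2 = strassen(mat_add(a21, a22), b11, cutoff)
    m3 = strassen(a11, mat_sub(b12, b22), cutoff)
    m4 = strassen(a22, mat_sub(b21, b11), cutoff)
    m5 = strassen(mat_add(a11, a12), b22, cutoff)
    m6 = strassen(mat_sub(a21, a11), mat_add(b11, b12), cutoff)
    m7 = strassen(mat_sub(a12, a22), mat_add(b21, b22), cutoff)
    c11 = mat_add(mat_sub(mat_add(m1, m4), m5), m7)
    c12 = mat_add(m3, m5)
    c21 = mat_add(m2, m4)
    c22 = mat_add(mat_sub(mat_add(m1, m3), m2), m6)
    return join(c11, c12, c21, c22)

def quadrant_mul(a, b, depth=1):
    # Recursive division into quadrants; the elementary sub-matrix products
    # at the bottom of the recursion use serial Strassen.
    if depth == 0 or len(a) % 2:
        return strassen(a, b)
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)
    def q(x, y):
        return quadrant_mul(x, y, depth - 1)
    return join(mat_add(q(a11, b11), q(a12, b21)),
                mat_add(q(a11, b12), q(a12, b22)),
                mat_add(q(a21, b11), q(a22, b21)),
                mat_add(q(a21, b12), q(a22, b22)))

def stripe_mul(a, b, stripes=2):
    # Stripe decomposition (assumed here: horizontal stripes of A, whole B).
    n = len(a)
    step = -(-n // stripes)  # ceiling division
    out = []
    for start in range(0, n, step):
        out.extend(naive_mul(a[start:start + step], b))
    return out
```

For integer inputs, `quadrant_mul(A, B)`, `stripe_mul(A, B)` and `naive_mul(A, B)` should return identical results, which gives a simple correctness check before any performance comparison is attempted.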