Algorithms for SMP-Clusters: Dense Matrix-Vector Multiplication

  • Authors:
  • Martin Schmollinger; Michael Kaufmann

  • Venue:
  • IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
  • Year:
  • 2002

Abstract

Clusters of symmetric multiprocessor (SMP) nodes are among the most important parallel architectures, both now and for the foreseeable future. The architecture consists of shared-memory nodes with multiple processors and a fast interconnection network between the nodes. New programming models try to exploit this architecture by using threads within the nodes and message-passing libraries for internode communication. To develop efficient algorithms, it is necessary to consider the hybrid nature of both the architecture and the programming models. In this paper, we present a methodology for designing efficient algorithms for SMP clusters on top of the κNUMA model. The κNUMA model is a computational model that extends the bulk-synchronous parallel (BSP) model with the characteristics of SMP clusters. The κNUMA methodology is a top-down method: it suggests building an optimal overall algorithm by developing optimal algorithms for each level of the machine hierarchy. We use the problem of dense matrix-vector multiplication to present the methodology. The theoretical results of our analysis are verified experimentally on a Linux cluster of dual Pentium III nodes.
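
The two-level structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes a simple row-block distribution, simulates the nodes with an ordinary loop instead of a message-passing library, and uses a thread pool as a stand-in for the shared-memory threads inside each node. All names (`split_rows`, `node_matvec`, `cluster_matvec`) are hypothetical, and the sketch shows only the decomposition, not the performance behavior analyzed in the paper.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(n_rows, parts):
    """Split range(n_rows) into `parts` contiguous blocks of near-equal size."""
    base, extra = divmod(n_rows, parts)
    blocks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def dot(row, x):
    """Inner product of one matrix row with the vector x."""
    return sum(a * b for a, b in zip(row, x))


def node_matvec(A, x, rows, threads_per_node):
    """Inner level (assumed): one simulated node computes its rows of A*x,
    dividing them among threads that share A and x."""
    chunks = split_rows(len(rows), threads_per_node)

    def work(chunk):
        # chunk holds positions within this node's row block
        return [dot(A[rows[i]], x) for i in chunk]

    with ThreadPoolExecutor(max_workers=threads_per_node) as pool:
        parts = list(pool.map(work, chunks))  # map preserves chunk order
    return [y for part in parts for y in part]


def cluster_matvec(A, x, nodes, threads_per_node):
    """Outer level (assumed): rows are distributed over nodes; appending the
    partial results stands in for gathering them over the network."""
    y = []
    for rows in split_rows(len(A), nodes):
        y.extend(node_matvec(A, x, rows, threads_per_node))
    return y
```

For example, `cluster_matvec([[1, 2], [3, 4], [5, 6]], [1, 1], nodes=2, threads_per_node=2)` returns `[3, 7, 11]`. In a real hybrid program the outer loop would run as one process per node, with the vector distributed and the result collected through message passing.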