Matrix Transpose on 2D Torus Array Processor

  • Authors:
  • Ahmed S. Zekri;Stanislav G. Sedukhin

  • Affiliations:
  • The University of Aizu, Japan;The University of Aizu, Japan

  • Venue:
  • CIT '06 Proceedings of the Sixth IEEE International Conference on Computer and Information Technology
  • Year:
  • 2006

Abstract

Previously, we represented the index space of the (n×n)-matrix multiply-add problem C = C + A×B as a 3D torus, where A, B, and C are rolled along the corresponding axes of the index space. All optimal 2D data allocations (resulting from projection) that solve the problem on the n×n torus array processor in n multiply-add-roll steps were obtained. In this paper, we formulate the operations needed to align both the data before computing and the results after computing as matrix multiply-add problems. These alignment operations are combined with the optimal data allocations that solve the matrix multiply-add problem to propose new algorithms that transpose an n×n matrix on the n×n torus array processor in O(n) multiply-add-roll steps. Using the proposed algorithms, we show different approaches to solving the transposed matrix multiply-add problem, C = C + A^T×B^T, on the 2D torus array processor.
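To make the notion of a "multiply-add-roll step" concrete, the following is a minimal sketch that simulates an n×n torus in plain Python. It uses the classic Cannon-style schedule (skew the operands, then repeat multiply-add and cyclic roll n times), which is an assumption chosen for illustration and not necessarily one of the data allocations derived in the paper. The function names roll_rows_left, roll_cols_up, and torus_multiply_add are hypothetical.

```python
def roll_rows_left(M, shifts):
    """Cyclically shift row i of M to the left by shifts[i] positions."""
    n = len(M)
    return [[M[i][(j + shifts[i]) % n] for j in range(n)] for i in range(n)]


def roll_cols_up(M, shifts):
    """Cyclically shift column j of M upward by shifts[j] positions."""
    n = len(M)
    return [[M[(i + shifts[j]) % n][j] for j in range(n)] for i in range(n)]


def torus_multiply_add(A, B, C):
    """Return C + A x B using n multiply-add-roll steps on a simulated torus.

    Assumption: Cannon-style alignment, used here only as an illustration.
    Cell (i, j) of the torus holds one element each of A, B, and C.
    """
    n = len(A)
    # Alignment before computing: row i of A rolls left by i,
    # column j of B rolls up by j.
    a = roll_rows_left(A, list(range(n)))
    b = roll_cols_up(B, list(range(n)))
    c = [row[:] for row in C]  # C stays in place; do not mutate the input
    for _ in range(n):
        # Multiply-add: every cell updates its local element of C.
        for i in range(n):
            for j in range(n):
                c[i][j] += a[i][j] * b[i][j]
        # Roll: A moves one cell left, B moves one cell up, with wraparound.
        a = roll_rows_left(a, [1] * n)
        b = roll_cols_up(b, [1] * n)
    return c


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    C = [[1, 0], [0, 1]]
    print(torus_multiply_add(A, B, C))
```

In this sketch the initial skew plays the role of an alignment operation before computing. After the skew, cell (i, j) holds A[i][(i+j) mod n] and B[(i+j) mod n][j], so n rolls bring every matching pair of operands together exactly once.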