Large Matrix Multiplication on a Novel Heterogeneous Parallel DSP Architecture

  • Authors:
  • Joar Sohl;Jian Wang;Dake Liu

  • Affiliations:
  • Department of Electrical Engineering, Linköping University, Linköping, Sweden 581 83;Department of Electrical Engineering, Linköping University, Linköping, Sweden 581 83;Department of Electrical Engineering, Linköping University, Linköping, Sweden 581 83

  • Venue:
  • APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
  • Year:
  • 2009

Abstract

This paper introduces ePUMA, a novel master-multi-SIMD on-chip multi-core architecture for embedded signal processing, and describes the parallel architecture and its memory subsystem. We evaluate the performance of large matrix multiplication on this architecture and compare it with a SIMD-extended data-parallel architecture. We also examine how well the new architecture scales with the number of SIMD co-processors. The experimental results show that the ePUMA memory subsystem can effectively hide data access overhead. With its 8-way SIMD data path and multi-SIMD parallel execution, the ePUMA architecture achieves a 45x speedup for matrix multiplication over the conventional SIMD extension.
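
The decomposition the abstract alludes to (rows of the result matrix spread over several SIMD co-processors, each working through 8-element chunks) can be illustrated with a small sequential model. This is only a sketch under stated assumptions, not the authors' kernel: the names `simd_dot`, `partition_rows`, `matmul_multi_simd` and the `num_simd` parameter are hypothetical, the row-wise partitioning is one plausible choice, and the memory subsystem that hides data access overhead is not modelled at all.

```python
LANES = 8  # 8-way SIMD data path, as stated in the abstract


def simd_dot(a_row, b_col):
    """Dot product accumulated in chunks of LANES elements.

    Each chunk stands in for one SIMD multiply-accumulate; the per-lane
    partial sums are reduced at the end.
    """
    acc = [0] * LANES
    for start in range(0, len(a_row), LANES):
        chunk_a = a_row[start:start + LANES]
        chunk_b = b_col[start:start + LANES]
        for lane, (x, y) in enumerate(zip(chunk_a, chunk_b)):
            acc[lane] += x * y
    return sum(acc)


def partition_rows(n_rows, num_simd):
    """Split row indices into contiguous ranges, one per co-processor (assumed scheme)."""
    base, extra = divmod(n_rows, num_simd)
    ranges = []
    start = 0
    for i in range(num_simd):
        size = base + (1 if i < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def matmul_multi_simd(a, b, num_simd=4):
    """Compute C = A * B, modelling num_simd co-processors one after another."""
    n, k, m = len(a), len(b), len(b[0])
    # Store B column-wise so each dot product reads contiguous data.
    b_cols = [[b[r][c] for r in range(k)] for c in range(m)]
    c = [[0] * m for _ in range(n)]
    for rows in partition_rows(n, num_simd):  # one iteration per modelled co-processor
        for i in rows:
            for j in range(m):
                c[i][j] = simd_dot(a[i], b_cols[j])
    return c
```

Because the row ranges are disjoint, the outer loop has no dependencies between iterations, which is what would let real co-processors run them concurrently; here they simply execute in sequence.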