Accelerating the Dynamic Programming for the Matrix Chain Product on the GPU

  • Authors:
  • Kazufumi Nishida;Yasuaki Ito;Koji Nakano

  • Venue:
  • ICNC '11 Proceedings of the 2011 Second International Conference on Networking and Computing
  • Year:
  • 2011

Abstract

Modern GPUs (Graphics Processing Units) can be used for general-purpose parallel computation. Users can develop parallel programs that run on GPUs using the programming architecture called CUDA (Compute Unified Device Architecture). The Matrix Chain Product Problem is an optimization problem: find the parenthesization of a matrix chain that minimizes the total number of scalar multiplications needed to compute the product of the chain. It is well known that this problem can be solved by dynamic programming in $O(n^3)$ time using tables of size $O(n^2)$, where $n$ is the number of matrices in the chain. The main contribution of this paper is an efficient parallel implementation of this $O(n^3)$-time algorithm on the GPU. Our implementation takes the architecture and programming issues of the GPU system into account. The experimental results show that, for a chain of 16384 randomly generated matrices, our implementation on the Nvidia GeForce GTX 480 achieves a speedup factor of 40 over a conventional CPU implementation.
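
For readers unfamiliar with the $O(n^3)$-time dynamic program the abstract refers to, the following is a minimal sequential sketch in Python of the textbook formulation. It is not the paper's GPU implementation, whose details the abstract does not give. The names `matrix_chain_order`, `parenthesize`, `cost`, and `split` are illustrative assumptions, as is the convention that matrix $i$ has shape `dims[i]` by `dims[i+1]`.

```python
def matrix_chain_order(dims):
    """Textbook O(n^3) dynamic program for the matrix chain product problem.

    Assumed convention (not from the paper): the chain has n = len(dims) - 1
    matrices, and matrix i (0-based) has shape dims[i] x dims[i + 1].
    Returns the minimum number of scalar multiplications and the split table.
    """
    n = len(dims) - 1
    if n < 1:
        raise ValueError("dims must describe at least one matrix")

    # cost[i][j]: minimum scalar multiplications to compute A_i ... A_j.
    # split[i][j]: index k at which the optimal parenthesization splits.
    # Both tables have O(n^2) entries.
    cost = [[0] * n for _ in range(n)]
    split = [[0] * n for _ in range(n)]

    # Fill the table diagonal by diagonal, from short sub-chains to long ones.
    for length in range(2, n + 1):
        # Entries for a fixed length depend only on shorter sub-chains,
        # so they are independent of one another.
        for i in range(n - length + 1):
            j = i + length - 1
            best = None
            for k in range(i, j):
                candidate = (cost[i][k] + cost[k + 1][j]
                             + dims[i] * dims[k + 1] * dims[j + 1])
                if best is None or candidate < best:
                    best = candidate
                    split[i][j] = k
            cost[i][j] = best

    return cost[0][n - 1], split


def parenthesize(split, i, j):
    """Rebuild the optimal parenthesization of A_{i+1} ... A_{j+1} as a string."""
    if i == j:
        return f"A{i + 1}"
    k = split[i][j]
    return f"({parenthesize(split, i, k)}{parenthesize(split, k + 1, j)})"


if __name__ == "__main__":
    # Three matrices of shapes 10x30, 30x5, and 5x60.
    best_cost, split_table = matrix_chain_order([10, 30, 5, 60])
    print(best_cost, parenthesize(split_table, 0, 2))
```

In this formulation, all entries on one diagonal of the table (sub-chains of equal length) can be computed independently, which is one natural source of parallelism; whether the paper exploits it in exactly this way is not stated in the abstract.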