A fine-grain load-adaptive algorithm of the 2D discrete wavelet transform for multithreaded architectures

  • Authors: Parimala Thulasiraman; Ashfaq A. Khokhar; Gerd Heber; Guang R. Gao

  • Affiliations: Department of Computer Science, University of Manitoba, Winnipeg, Manitoba R3T 2N2, Canada; Department of EECS, University of Illinois at Chicago, Chicago, IL; Cornell Theory Center, Cornell University, 638 Rhodes Hall, Ithaca, NY; Department of ECE, University of Delaware, Newark, DE

  • Venue: Journal of Parallel and Distributed Computing
  • Year: 2004

Abstract

In this paper we develop a load-adaptive multithreaded algorithm to compute the 2D Discrete Wavelet Transform (DWT) and describe its implementation on a fine-grain multithreading platform. In a 2D DWT computation, the problem size shrinks at every decomposition level, and the length of the emerging computation paths also varies. The parallel algorithm proposed in this paper dynamically scales itself to the varying problem size. During any iteration, the ratio of the number of local threads to the number of remote threads issued by a processor can be adjusted to be greater than 1 by controlling the algorithm parameters. This approach provides an opportunity to interleave computation and communication without explicitly introducing idle cycles spent waiting for the remote threads to finish. Experimental results are reported based on implementations of the proposed algorithm on a 20-node emulated multithreaded platform, EARTH-MANNA, specifically designed for fine-grain multithreaded paradigms. We show that multithreaded implementations of the proposed algorithm are at least 2 times faster than the MPI-based message-passing implementations reported in the literature, assuming the same processor speed. We further show that the proposed algorithm and its implementations scale linearly with respect to problem and machine sizes.
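
To illustrate why the problem size shrinks at every decomposition level, here is a minimal sequential sketch of a multi-level 2D DWT. It is not the paper's multithreaded algorithm: the averaging Haar filter, the function names (`haar_1d`, `haar_2d_level`, `dwt2d`), and the 8x8 test image are all assumptions chosen for brevity. The input is assumed to be square with a power-of-two side length.

```python
# Illustrative sketch only. The Haar filter and every name below are
# assumptions for this example, not the algorithm from the paper.

def haar_1d(values):
    """One level of an averaging Haar transform on an even-length list.

    Returns the approximation half followed by the detail half.
    """
    half = len(values) // 2
    approx = [(values[2 * i] + values[2 * i + 1]) / 2 for i in range(half)]
    detail = [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(half)]
    return approx + detail


def haar_2d_level(block):
    """Transform every row of a square block, then every column."""
    rows = [haar_1d(r) for r in block]
    cols = [haar_1d(list(c)) for c in zip(*rows)]  # columns of the row result
    return [list(r) for r in zip(*cols)]           # transpose back


def dwt2d(image, levels):
    """Multi-level 2D decomposition of a square, power-of-two image.

    Returns the coefficient matrix and the side length processed at each
    level, which halves from one level to the next.
    """
    n = len(image)
    out = [list(r) for r in image]
    sizes = []
    for _ in range(levels):
        sizes.append(n)
        block = haar_2d_level([row[:n] for row in out[:n]])
        for i in range(n):
            out[i][:n] = block[i]
        n //= 2  # only the approximation quadrant is decomposed further
    return out, sizes


image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
coeffs, sizes = dwt2d(image, 3)
print(sizes)  # [8, 4, 2]
```

Each level touches only a quarter of the data handled by the previous one, so the amount of available work drops quickly. This is the varying problem size that the abstract says the load-adaptive algorithm scales itself to.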