Task scheduling using a block dependency DAG for block-oriented sparse Cholesky factorization

  • Authors:
  • Heejo Lee;Jong Kim;Sung Je Hong;Sunggu Lee

  • Affiliations:
  • Ahnlab, Inc., 8F V-Valley Bldg., 724 Suseo-dong, Gangnam-gu Seoul 135-744, South Korea;Department of Computer Science and Engineering, Pohang University of Science and Technology, Pohang 790-784, South Korea;Department of Computer Science and Engineering, Pohang University of Science and Technology, Pohang 790-784, South Korea;Department of Electrical Engineering, Pohang University of Science and Technology, Pohang 790-784, South Korea

  • Venue:
  • Parallel Computing
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

Block-oriented sparse Cholesky factorization decomposes a sparse matrix into rectangular subblocks; each block can then be handled as a computational unit in order to increase data reuse in a hierarchical memory system. Also, the factorization method increases the degree of concurrency and reduces the overall communication volume so that it performs more efficiently on a distributed-memory multiprocessor system than the customary column-oriented factorization method. But until now, mapping of blocks to processors has been designed for load balance with restricted communication patterns. In this paper, we represent tasks using a block dependency DAG that represents the execution behavior of block sparse Cholesky factorization in a distributed-memory system. Since the characteristics of tasks for block Cholesky factorization are different from those of the conventional parallel task model, we propose a new task scheduling algorithm using a block dependency DAG. The proposed algorithm consists of two stages: early-start clustering, and affined cluster mapping (ACM). The early-start clustering stage is used to cluster tasks while preserving the earliest start time of a task without limiting parallelism. After task clustering, the ACM stage allocates clusters to processors considering both communication cost and load balance. Experimental results on a Myrinet cluster system show that the proposed task scheduling approach outperforms other processor mapping methods.