Task scheduling using a block dependency DAG for block-oriented sparse Cholesky factorization
SAC '00 Proceedings of the 2000 ACM symposium on Applied computing - Volume 2
Block-oriented sparse Cholesky factorization decomposes a sparse matrix into rectangular subblocks; each block can then be handled as a computational unit, increasing data reuse in a hierarchical memory system. The method also increases the degree of concurrency and reduces the overall communication volume, so it performs more efficiently on a distributed-memory multiprocessor than the customary column-oriented factorization. Until now, however, the mapping of blocks to processors has been designed for load balance under restricted communication patterns. In this paper, we represent tasks with a block dependency DAG that captures the execution behavior of block sparse Cholesky factorization on a distributed-memory system. Because the characteristics of these tasks differ from those of the conventional parallel task model, we propose a new task scheduling algorithm based on the block dependency DAG. The algorithm consists of two stages: early-start clustering and affined cluster mapping (ACM). The early-start clustering stage groups tasks into clusters while preserving each task's earliest start time and without limiting parallelism. The ACM stage then allocates clusters to processors, considering both communication cost and load balance. Experimental results on a Myrinet cluster system show that the proposed task scheduling approach outperforms other processor mapping methods.
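The two-stage structure described in the abstract can be illustrated with a minimal sketch. This is not the paper's exact algorithm: the task names, weights, and the specific merge rule (zero the edge from one predecessor when that does not delay the task's earliest start time) are illustrative assumptions, and the mapping stage here weighs load balance only, whereas the paper's ACM stage also considers inter-cluster communication cost.

```python
from collections import defaultdict

def earliest_start_times(tasks, deps, cost, comm):
    """Earliest start time of each task, assuming every cross-task
    edge (u, t) pays the communication cost comm[(u, t)].
    `tasks` must be given in topological order."""
    est = {}
    for t in tasks:
        est[t] = max((est[u] + cost[u] + comm[(u, t)] for u in deps[t]),
                     default=0)
    return est

def early_start_clustering(tasks, deps, cost, comm):
    """Stage 1 (sketch): append a task to a predecessor's cluster,
    zeroing that edge's communication cost, only when the merge does
    not delay the task's earliest start time."""
    est = earliest_start_times(tasks, deps, cost, comm)
    cluster = {t: i for i, t in enumerate(tasks)}  # singleton clusters
    tail = {cluster[t]: t for t in tasks}          # last task per cluster
    for t in tasks:
        for u in deps[t]:
            if tail[cluster[u]] != u:
                continue  # cluster already extended by another task
            merged_start = max(
                (est[v] + cost[v] + (0 if v == u else comm[(v, t)])
                 for v in deps[t]), default=0)
            if merged_start <= est[t]:
                cluster[t] = cluster[u]
                est[t] = merged_start
                tail[cluster[u]] = t
                break
    return cluster

def affined_cluster_mapping(cluster, cost, n_procs):
    """Stage 2 (sketch): place clusters, heaviest first, on the
    currently least-loaded processor."""
    work = defaultdict(float)
    for t, c in cluster.items():
        work[c] += cost[t]
    load = [0.0] * n_procs
    placement = {}
    for c, w in sorted(work.items(), key=lambda kv: -kv[1]):
        p = min(range(n_procs), key=lambda i: load[i])
        placement[c] = p
        load[p] += w
    return placement

# Toy diamond DAG: a -> b, a -> c, b -> d, c -> d
tasks = ["a", "b", "c", "d"]
deps = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
cost = {"a": 2, "b": 3, "c": 3, "d": 1}
comm = defaultdict(lambda: 1)  # unit cost on every edge
cl = early_start_clustering(tasks, deps, cost, comm)
pm = affined_cluster_mapping(cl, cost, 2)
print(cl)  # a, b, d share a cluster; c stays separate
print(pm)
```

On the toy DAG, the chain a-b-d is clustered to remove its communication delays while c remains a separate cluster, so the two clusters can then be mapped to different processors.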