Partitioning Unstructured Computational Graphs for Nonuniform and Adaptive Environments

Authors:
Maher Kaddoura; Chao-Wei Ou; Sanjay Ranka
Venue:
IEEE Parallel & Distributed Technology: Systems & Technology
Year:
1995

Abstract

In heterogeneous computing environments, computational resources can have a nonuniform distribution that changes over time. To execute in such an environment, many irregular and loosely synchronous data-parallel applications must be carefully mapped. This article examines algorithms that provide this mapping by efficiently partitioning the computational graphs of these applications.

Heterogeneity has become commonplace in high-performance computing environments. In the future most computing environments will consist of a cluster of nodes connected by a high-speed interconnection network. Node architectures will include high-performance SIMD and MIMD parallel computers as well as numerous high-performance workstations.

In a heterogeneous environment, users can pool many computational resources to create a large virtual machine. This environment can be nonuniform; that is, the machines or processors can have different computational powers. However, the pool of resources might change over the computation's lifetime because of machine failures or differing use patterns. It should be possible to add or remove resources without significantly affecting the other machines or changing the existing software. In such an adaptive environment, an individual machine could either be dedicated to a single user's computation or shared by users. The former strategy has the advantage that each machine has static computing capability, while the latter has the advantage of a higher rate of use.

In this article we'll examine the mapping requirements for the parallelization of a large class of irregular and loosely synchronous data-parallel applications on nonuniform and adaptive environments. The computational structure of these applications can be described as a computational graph. In such a graph, nodes represent computational tasks and edges describe the communication between tasks.

For many applications, the graph's vertices correspond to 2D and 3D coordinates, and the interaction between computations is limited to physically proximate vertices. Recursive coordinate bisection, index-based mapping, and recursive spectral bisection can exploit these properties to partition such applications. Essentially, these algorithms cluster proximate points together to form a partition such that the numbers of vertices attached to every partition are equal.

Other researchers have used these algorithms to map graphs onto uniform parallel machines. We'll evaluate how the algorithms partition computational graphs on a simulation of a cluster of machines constituting a static, nonuniform environment. (In a static environment, computational resources are fixed throughout the completion of all tasks.) The algorithms assume that an interconnection network connects all the processors and that the cost of unit communication is the same between all the processors. (A bus is an example of such a network.) Although our algorithms specifically target a network-connected cluster of workstations, the issues are similar for parallelizing such applications on a network of machines.

We'll also show how to use or extend these algorithms for an adaptive environment. Mapping graph vertices onto a 1D space can facilitate extremely fast remapping when the environment changes. This simple remapping achieves acceptable partitioning, though poorer than with mapping from scratch.
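The idea of mapping vertices onto a 1D space and remapping quickly can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes 2D vertices with non-negative integer coordinates, uses a Morton (Z-order) key as a stand-in for whatever index the article actually uses, and assumes that each machine should receive a share of vertices proportional to its relative computational power. The names `morton_key`, `order_vertices`, and `split_by_power` are hypothetical.

```python
from itertools import accumulate


def morton_key(x, y, bits=16):
    """Interleave the bits of two non-negative integers (Z-order index).

    Assumption: a Morton key stands in for the article's 1D index.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return key


def order_vertices(coords):
    """Return vertex ids sorted by their 1D key. This is done once."""
    return sorted(range(len(coords)), key=lambda v: morton_key(*coords[v]))


def split_by_power(order, powers):
    """Cut the 1D ordering into contiguous chunks proportional to `powers`.

    Assumption: `powers` are relative computational powers, one per machine.
    """
    n, total = len(order), sum(powers)
    cuts = [round(n * c / total) for c in accumulate(powers)]
    cuts[-1] = n  # guard against floating-point drift on the last cut
    parts, start = [], 0
    for end in cuts:
        parts.append(order[start:end])
        start = end
    return parts


# Hypothetical usage: a 4x4 grid of vertices on three machines.
coords = [(x, y) for x in range(4) for y in range(4)]
order = order_vertices(coords)                # sorted once
parts = split_by_power(order, [1, 1, 2])      # third machine is twice as fast
parts_after = split_by_power(order, [1, 1])   # third machine leaves the pool
```

When the pool changes, only `split_by_power` runs again. The ordering is reused, so remapping costs one pass over the cut points instead of a fresh partitioning, which is the trade-off the abstract describes: fast, but possibly poorer than mapping from scratch.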