Spatial query processing in an object-oriented database system
SIGMOD '86 Proceedings of the 1986 ACM SIGMOD international conference on Management of data
Partitioning sparse matrices with eigenvectors of graphs
SIAM Journal on Matrix Analysis and Applications
Performance of dynamic load balancing algorithms for unstructured mesh calculations
Concurrency: Practice and Experience
Hypergraph-Partitioning-Based Decomposition for Parallel Sparse-Matrix Vector Multiplication
IEEE Transactions on Parallel and Distributed Systems
A hypergraph-partitioning approach for coarse-grain decomposition
Proceedings of the 2001 ACM/IEEE conference on Supercomputing
Atomistic Simulation of Nanostructured Materials
IEEE Computational Science & Engineering
Analysis of the Clustering Properties of the Hilbert Space-Filling Curve
IEEE Transactions on Knowledge and Data Engineering
IEEE Transactions on Parallel and Distributed Systems
ICGT '02 Proceedings of the First International Conference on Graph Transformation
I/O Requirements of Scientific Applications: An Evolutionary View
HPDC '96 Proceedings of the 5th IEEE International Symposium on High Performance Distributed Computing
Compact Hilbert indices: Space-filling curves for domains with unequal side lengths
Information Processing Letters
Journal of Parallel and Distributed Computing
On Two-Dimensional Sparse Matrix Partitioning: Models, Methods, and a Recipe
SIAM Journal on Scientific Computing
Proceedings of the international conference on Supercomputing
Hi-index | 0.00 |
In heterogeneous computing environments, computational resources can have a nonuniform distribution that changes over time. To execute in such an environment, many irregular and loosely synchronous data-parallel applications must be carefully mapped. This article examines algorithms that provide this mapping by efficiently partitioning the computational graphs of these applications.Heterogeneity has become commonplace in high-performance computing environments. In the future most computing environments will consist of a cluster of nodes connected by a high-speed interconnection network. Node architectures will include high-performance SIMD and MIMD parallel computers as well as numerous high-performance workstations.In a heterogeneous environment, users can pool many computational resources to create a large virtual machine. This environment can be nonuniform -- that is, the machines or processors can have different computational powers. However, the pool of resources might change over the computation's lifetime because of machine failures or differing use patterns. It should be possible to add or remove resources without significantly affecting the other machines or changing the existing software. In such an adaptive environment, an individual machine could either be dedicated to a single user's computation or shared by users. The former strategy has the advantage that each machine has static computing capability, while the latter has the advantage of a higher rate of use.In this article we'll examine the mapping requirements for the parallelization of a large class of irregular and loosely synchronous data-parallel applications on nonuniform and adaptive environments. The computational structure of these applications can be described as a computational graph. In such a graph, nodes represent computational tasks and edges describe the communication between tasks.For many applications, the graph's vertices correspond to 2D and 3D coordinates, and the interaction between computations is limited to physically proximate vertices. Recursive coordinate bisection, index-based mapping, and recursive spectral bisection can exploit these properties to partition such applications. Essentially, these algorithms cluster proximate points together to form a partition such that the numbers of vertices attached to every partition are equal.Other researchers have used these algorithms to map graphs onto uniform parallel machines. We'll evaluate how the algorithms partition computational graphs on a simulation of a cluster of machines constituting a static, nonuniform environment. (In a static environment, computational resources are fixed throughout the completion of all tasks.) The algorithms assume that an interconnection network connects all the processors and that the cost of unit communication is the same between all the processors. (A bus is an example of such a network.) Although our algorithms specifically target a network-connected cluster of workstations, the issues are similar for parallelizing such applications on a network of machines.We'll also show how to use or extend these algorithms for an adaptive environment. Mapping graph vertices onto a 1D space can facilitate extremely fast remapping when the environment changes. This simple remapping achieves acceptable partitioning, though poorer than with mapping from scratch.