Controlling Unstructured Mesh Partitions for Massively Parallel Simulations

Authors:
Min Zhou;Onkar Sahni;Karen D. Devine;Mark S. Shephard;Kenneth E. Jansen
Affiliations:
zhoum@scorec.rpi.edu and osahni@scorec.rpi.edu and shephard@scorec.rpi.edu;-;kddevin@sandia.gov;-;kenneth.jansen@colorado.edu
Venue:
SIAM Journal on Scientific Computing
Year:
2010

Abstract

Parallel simulations at extreme scale require that the mesh be distributed across a large number of processors with equal workload and minimal interpart communication. A number of algorithms have been developed to meet these goals, e.g., graph/hypergraph-based and coordinate-based methods. However, global implementations of current approaches can fail at very large core counts; this is resolved by combining global and local partitioning using multiple parts per processor. Another limitation of graph/hypergraph-based partitioning is that it uses one type of mesh entity as graph nodes; thus, the balance of other mesh entities may not be optimal. In three-dimensional (3-D) linear finite element analysis, it is common to select mesh regions (elements) as partition objects. In the examples considered, the regions are well balanced up to 163,840 parts for a 1.07 billion element mesh, while the vertex imbalance is as high as 19.52%. Two methods are developed that work in conjunction with graph/hypergraph-based procedures to provide improved partitions. Example computations executed on an IBM Blue Gene/P system using up to 163,840 cores demonstrate the usefulness of the procedures, particularly for time-critical calculations where individual cores may be lightly loaded in terms of the number of mesh entities per core. The algorithms presented in this paper reduced the vertex imbalance from 17.8% to 4.97% for a partition with 131,072 parts and accelerated the equation solution phase of the finite element analysis by 10.4%.
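To illustrate how a partition can balance regions perfectly yet leave vertices imbalanced, the following minimal Python sketch measures both quantities on a toy mesh. It is not the paper's procedure. It assumes the common definition of imbalance as the maximum per-part count divided by the mean per-part count, minus one, and it assumes that a vertex shared between parts is counted on every part that touches it. The abstract states neither convention. The names `imbalance`, `part_loads`, `elements`, and `elem_part` are hypothetical.

```python
def imbalance(counts):
    """Return max/mean - 1 for a list of per-part counts (assumed definition)."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean - 1.0


def part_loads(elements, elem_part, num_parts):
    """Count elements and vertices held by each part.

    elements  -- list of tuples of vertex ids, one tuple per element
    elem_part -- part id assigned to each element (elements are the partition objects)
    A vertex is counted once on every part that owns an element touching it.
    """
    elem_count = [0] * num_parts
    part_vertices = [set() for _ in range(num_parts)]
    for verts, part in zip(elements, elem_part):
        elem_count[part] += 1
        part_vertices[part].update(verts)
    vert_count = [len(vs) for vs in part_vertices]
    return elem_count, vert_count


# Toy triangle mesh: part 0 holds two triangles sharing an edge,
# part 1 holds two triangles with no vertex in common.
elements = [(0, 1, 2), (0, 2, 3), (3, 4, 5), (6, 7, 8)]
elem_part = [0, 0, 1, 1]

elem_count, vert_count = part_loads(elements, elem_part, num_parts=2)
elem_imb = imbalance(elem_count)   # 2 and 2 elements per part
vert_imb = imbalance(vert_count)   # 4 and 6 vertices per part

print(f"element imbalance: {elem_imb:.1%}")
print(f"vertex imbalance:  {vert_imb:.1%}")
```

In this toy case both parts hold two elements, so the element imbalance is zero, while the vertex counts of 4 and 6 give a vertex imbalance of 20% under the assumed definition.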