Current multilevel repartitioning schemes tend to perform well on certain types of problems while obtaining worse results on others. We present two new multilevel algorithms for repartitioning adaptive meshes. They improve the performance of multilevel schemes on the types of problems for which current schemes perform poorly, while maintaining similar or better results on the problems for which current schemes perform well. First, we present a new scratch-remap scheme called Locally-matched Multilevel Scratch-remap (or simply LMSR). LMSR tries to compute a high-quality partitioning that has a large amount of overlap with the original partitioning. We show that LMSR generally decreases the data redistribution costs required to balance the load compared to current scratch-remap schemes. Second, we present a new diffusion-based scheme that we refer to as Wavefront Diffusion. In Wavefront Diffusion, the flow of vertices moves in a wavefront from overweight to underweight subdomains. We show that Wavefront Diffusion obtains significantly lower data redistribution costs while maintaining similar or better edge-cut results compared to existing diffusion algorithms. We also compare Wavefront Diffusion with LMSR and show that the two schemes provide a trade-off between edge-cut and data redistribution costs for a wide range of problems. Our experimental results on a Cray T3E, an IBM SP2, and a cluster of Pentium Pro workstations show that both schemes are fast and scalable. For example, both are capable of repartitioning a seven-million-vertex graph in under three seconds on 128 processors of a Cray T3E. Our schemes obtained relative speedups of between 9 and 12 when the number of processors was increased by a factor of 16 on a Cray T3E.
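The two quantities traded off above, edge-cut and data redistribution cost, can be made concrete with a small sketch. The code below is not LMSR or Wavefront Diffusion. It only illustrates the generic scratch-remap idea of relabeling a freshly computed partitioning so that it overlaps the original one as much as possible. The function names (`edge_cut`, `redistribution_cost`, `greedy_remap`), the unit vertex weights, the greedy matching rule, and the six-vertex path graph are all assumptions made for illustration.

```python
from collections import Counter


def edge_cut(edges, part):
    """Count edges whose two endpoints lie in different subdomains."""
    return sum(1 for u, v in edges if part[u] != part[v])


def redistribution_cost(old_part, new_part, weight=None):
    """Total weight of vertices that change subdomain (unit weights assumed by default)."""
    weight = weight or [1] * len(old_part)
    return sum(w for o, n, w in zip(old_part, new_part, weight) if o != n)


def greedy_remap(old_part, new_part, k):
    """Relabel the subdomains of new_part to increase overlap with old_part.

    Greedy heuristic (an assumption, not the paper's method): pair new and
    old labels in order of decreasing overlap, each label used at most once.
    """
    overlap = Counter(zip(new_part, old_part))
    mapping, used = {}, set()
    for (n, o), _ in sorted(overlap.items(), key=lambda kv: (-kv[1], kv[0])):
        if n not in mapping and o not in used:
            mapping[n] = o
            used.add(o)
    # New labels left unmatched take whichever old labels remain free.
    free = [p for p in range(k) if p not in used]
    for n in range(k):
        if n not in mapping:
            mapping[n] = free.pop(0)
    return [mapping[n] for n in new_part]


# Hypothetical example: a path graph on six vertices, two subdomains.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
old = [0, 0, 0, 0, 1, 1]        # imbalanced: 4 vertices vs 2
scratch = [1, 1, 1, 0, 0, 0]    # balanced, but labels chosen without regard to old
remapped = greedy_remap(old, scratch, k=2)

cost_before = redistribution_cost(old, scratch)
cost_after = redistribution_cost(old, remapped)
cut_after = edge_cut(edges, remapped)
```

In this toy case the relabeling leaves the edge-cut unchanged, since it only renames subdomains, while reducing the number of vertices that must move. That is the sense in which overlap with the original partitioning lowers redistribution cost.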