Graph partitioning is often used for load balancing in parallel computing, but hypergraph partitioning offers several known advantages. First, hypergraphs model communication volume more accurately; second, they are more expressive and can better represent nonsymmetric problems. Hypergraph partitioning is particularly well suited to parallel sparse matrix-vector multiplication, a common kernel in scientific computing. We present a parallel software package for hypergraph (and sparse matrix) partitioning developed at Sandia National Laboratories. The algorithm is a variation on multilevel partitioning. Our parallel implementation is novel in that it uses a two-dimensional data distribution among processors. Empirical results show that the implementation achieves good speedup on several large problems (up to 33 million nonzeros) with up to 64 processors on a Linux cluster.
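To see why hypergraphs model communication volume exactly, consider the standard column-net model for y = Ax: each row of A is a vertex, and each column j becomes a net (hyperedge) connecting every row with a nonzero in column j. Under the (λ−1) metric, each net contributes one word of communication for every extra part it spans, which equals the number of x-vector entries that must be sent. The sketch below is an illustration of this metric, not the package's API; the function name and data layout are hypothetical.

```python
def comm_volume(nonzeros, part):
    """Communication volume of a row partition under the column-net model.

    nonzeros: iterable of (row, col) positions of A's nonzero entries
    part:     dict mapping each row index to its processor (part) id
    Returns the total (lambda - 1) cost: for each column-net, one word of
    communication per additional part the net spans.
    """
    nets = {}  # column -> set of parts owning a row with a nonzero there
    for r, c in nonzeros:
        nets.setdefault(c, set()).add(part[r])
    return sum(len(parts) - 1 for parts in nets.values())

# 4x4 example: rows 0,1 on processor 0; rows 2,3 on processor 1.
# Columns 0 and 2 are each shared by both processors, so two x-entries
# must be communicated in total.
A = [(0, 0), (0, 2), (1, 1), (2, 0), (2, 2), (3, 3)]
part = {0: 0, 1: 0, 2: 1, 3: 1}
print(comm_volume(A, part))  # → 2
```

A cut-edge count on the corresponding graph model would only approximate this quantity; the hypergraph metric counts each required x-entry transfer exactly once, which is why hypergraph partitioning minimizes the true communication volume.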