Hierarchical Diagonal Blocking and Precision Reduction Applied to Combinatorial Multigrid

Authors:
Guy E. Blelloch;Ioannis Koutis;Gary L. Miller;Kanat Tangwongsan
Venue:
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Year:
2010

Abstract

Memory bandwidth is a major limiting factor in the scalability of parallel iterative algorithms that rely on sparse matrix-vector multiplication (SpMV). This paper introduces Hierarchical Diagonal Blocking (HDB), an approach that we believe captures many of the existing optimization techniques for SpMV in a common representation. Using this representation in conjunction with precision-reduction techniques, we develop and evaluate high-performance SpMV kernels. We also study the implications of using our SpMV kernels in a complete iterative solver. Our method of choice is a Combinatorial Multigrid solver that can fully utilize our fastest reduced-precision SpMV kernel without sacrificing the quality of the solution. We provide an extensive empirical evaluation of the approach on a variety of benchmark matrices, demonstrating substantial speedups on all matrices considered.
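
To make the precision-reduction idea concrete, the following is a minimal sketch of SpMV over a matrix in compressed sparse row (CSR) form, with the nonzero values stored once in 64-bit and once in 32-bit floating point. It is not the paper's HDB representation or its kernels; the CSR layout, the function name csr_spmv, and the small example matrix are all assumptions chosen for illustration. The point is only that narrower value storage halves the bytes read per nonzero, which is what matters when the kernel is limited by memory bandwidth.

```python
from array import array


def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form (hypothetical helper)."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0  # Python floats are doubles, so accumulation stays in 64-bit
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Illustrative 3x3 matrix (an assumption, not a benchmark from the paper):
# [ 2 -1  0]
# [-1  2 -1]
# [ 0 -1  2]
row_ptr = [0, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 1, 2]
vals64 = array("d", [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0])
vals32 = array("f", vals64)  # same nonzeros stored in single precision

x = [1.0, 2.0, 3.0]
y64 = csr_spmv(row_ptr, col_idx, vals64, x)
y32 = csr_spmv(row_ptr, col_idx, vals32, x)

bytes64 = len(vals64) * vals64.itemsize
bytes32 = len(vals32) * vals32.itemsize
```

In this example every nonzero is exactly representable in single precision, so y32 matches y64 while bytes32 is half of bytes64. For general matrices the stored values are rounded, and whether the resulting error is acceptable depends on the solver that consumes the product.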