Rescheduling for Locality in Sparse Matrix Computations

Authors:
Michelle Mills Strout;Larry Carter;Jeanne Ferrante
Venue:
ICCS '01 Proceedings of the International Conference on Computational Sciences-Part I
Year:
2001



Abstract

In modern computer architectures, memory hierarchies cause a program's data locality to directly affect its performance. Data locality occurs when a piece of data is still in cache when it is reused. For dense matrix computations, loop transformations can be used to improve data locality. Sparse matrix computations, however, have non-affine loop bounds and indirect memory references, which prohibit the use of compile-time loop transformations. This paper describes serial sparse tiling, an algorithm that tiles at runtime. We test a runtime-tiled version of sparse Gauss-Seidel on four different architectures, where it exhibits speedups of up to 2.7. The paper also gives a static model for determining tile size and outlines how overhead affects the overall speedup.
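
To make the "non-affine loop bounds and indirect memory references" concrete, here is a minimal sketch of an untiled sparse Gauss-Seidel sweep. It is not the paper's serial sparse tiling algorithm, only the kind of baseline kernel such tiling would reschedule. The compressed sparse row (CSR) layout and the names `gauss_seidel_csr`, `row_ptr`, `col_idx`, and `vals` are assumptions made for illustration; the abstract does not specify a storage format.

```python
def gauss_seidel_csr(row_ptr, col_idx, vals, b, x, sweeps):
    """Run in-place Gauss-Seidel sweeps for A x = b, with A in CSR form.

    Assumes every row stores a nonzero diagonal entry.
    """
    n = len(b)
    for _ in range(sweeps):
        for i in range(n):
            diag = 0.0
            acc = b[i]
            # The inner loop bounds are read from row_ptr at run time,
            # so they are not affine functions of the loop index i.
            for k in range(row_ptr[i], row_ptr[i + 1]):
                j = col_idx[k]  # indirect reference: x is indexed through col_idx
                if j == i:
                    diag = vals[k]
                else:
                    acc -= vals[k] * x[j]
            x[i] = acc / diag
    return x


# Hypothetical 3x3 example: A = [[4,-1,0],[-1,4,-1],[0,-1,4]], b = [3,2,3].
# The exact solution of this system is x = [1, 1, 1].
row_ptr = [0, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 1, 2]
vals = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
b = [3.0, 2.0, 3.0]
x = gauss_seidel_csr(row_ptr, col_idx, vals, b, [0.0, 0.0, 0.0], sweeps=50)
```

Because the entries of `x` touched in row `i` depend on the contents of `col_idx`, a compiler cannot determine the access pattern statically; this is why the reordering has to be computed at run time, once the sparsity structure is known.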