A parallel numerical solver using hierarchically tiled arrays

Authors:
James C. Brodman;G. Carl Evans;Murat Manguoglu;Ahmed Sameh;María J. Garzarán;David Padua
Affiliations:
University of Illinois at Urbana-Champaign, Dept. of Computer Science;University of Illinois at Urbana-Champaign, Dept. of Computer Science;Purdue University, Dept. of Computer Science;Purdue University, Dept. of Computer Science;University of Illinois at Urbana-Champaign, Dept. of Computer Science;University of Illinois at Urbana-Champaign, Dept. of Computer Science
Venue:
LCPC'10 Proceedings of the 23rd international conference on Languages and compilers for parallel computing
Year:
2010

Abstract

Solving linear systems is an important problem in scientific computing. Exploiting parallelism is essential for solving complex systems, and this traditionally involves writing parallel algorithms on top of a library such as MPI. The SPIKE family of algorithms is one well-known example of a parallel solver for linear systems. The Hierarchically Tiled Array (HTA) data type extends traditional data-parallel array operations with explicit tiling and allows programmers to manipulate tiles directly. The tiles of the HTA data type map naturally to the block structure of many numeric computations, including the SPIKE family of algorithms. The higher level of abstraction of the HTA makes the same program portable across different platforms. Current implementations target both shared-memory and distributed-memory models. In this paper we present a proof of concept for portable linear solvers. We implement two algorithms from the SPIKE family using the HTA library. We show that our implementations of SPIKE exploit the abstractions provided by the HTA to produce compact, clean code that runs on both shared-memory and distributed-memory models without modification. We discuss how we map the algorithms to HTA programs and examine their performance. We compare the performance of our HTA codes to that of comparable codes written in MPI and of current state-of-the-art linear algebra routines.
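
To make the correspondence between tiles and blocks concrete, the following is a minimal Python sketch, not the HTA library and not the authors' code. It assumes a tridiagonal system stored as three flat lists, and the names `tile`, `solve_tridiagonal`, and `solve_block_diagonal` are hypothetical, introduced only for this illustration. It covers only the block-diagonal stage in which each partition is solved independently. The coupling between neighbouring partitions, which SPIKE resolves through a reduced system, is deliberately ignored here.

```python
from concurrent.futures import ThreadPoolExecutor


def tile(values, tile_size):
    """Split a flat list into consecutive tiles of length tile_size."""
    if len(values) % tile_size != 0:
        raise ValueError("length must be a multiple of tile_size")
    return [values[i:i + tile_size] for i in range(0, len(values), tile_size)]


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for one tile (no pivoting).

    Assumes the tile is diagonally dominant. lower[0] and upper[-1] lie
    outside the tile and are ignored.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_block_diagonal(lower, diag, upper, rhs, tile_size, workers=2):
    """Solve each diagonal block independently, one task per tile.

    Entries that couple neighbouring tiles are dropped, so this solves
    only the block-diagonal part of the system, not the full system.
    """
    tiles = list(zip(tile(lower, tile_size), tile(diag, tile_size),
                     tile(upper, tile_size), tile(rhs, tile_size)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        solved = list(pool.map(lambda t: solve_tridiagonal(*t), tiles))
    # Flatten the per-tile solutions back into one vector.
    return [value for part in solved for value in part]


if __name__ == "__main__":
    n, tile_size = 6, 3
    g = solve_block_diagonal([-1.0] * n, [2.0] * n, [-1.0] * n,
                             [1.0, 0.0, 1.0, 1.0, 0.0, 1.0], tile_size)
    print(g)
```

Each tile is handled by its own task, which is the property a tiled data type exposes: the same per-tile function could, in principle, be dispatched to threads or to separate processes without changing the calling code.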