Hypergraph partitioning for automatic memory hierarchy management

Authors:
Sriram Krishnamoorthy;Umit Catalyurek;Jarek Nieplocha;Atanas Rountev;P. Sadayappan
Affiliations:
The Ohio State University;The Ohio State University;The Ohio State University;The Ohio State University;The Ohio State University
Venue:
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Year:
2006

References: 16
Cited by: 8

CHARM++: a portable concurrent object oriented system based on C++

OOPSLA '93 Proceedings of the eighth annual conference on Object-oriented programming systems, languages, and applications
MOB forms: a class of multilevel block algorithms for dense linear algebra operations

ICS '94 Proceedings of the 8th international conference on Supercomputing
A manual for the CHAOS runtime library

A manual for the CHAOS runtime library
Data-centric multi-level blocking

Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Multilevel hypergraph partitioning: application in VLSI domain

DAC '97 Proceedings of the 34th annual Design Automation Conference
Level 3 basic linear algebra subprograms for sparse matrices: a user-level interface

ACM Transactions on Mathematical Software (TOMS)
Maximizing parallelism and minimizing synchronization with affine partitions

Parallel Computing - Special issues on languages and compilers for parallel computers
Hypergraph-Partitioning-Based Decomposition for Parallel Sparse-Matrix Vector Multiplication

IEEE Transactions on Parallel and Distributed Systems
Synthesizing transformations for locality enhancement of imperfectly-nested loop nests

Proceedings of the 14th international conference on Supercomputing
Blocking and array contraction across arbitrarily nested loops using affine partitioning

PPoPP '01 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming
Decomposing Irregularly Sparse Matrices for Parallel Matrix-Vector Multiplication

IRREGULAR '96 Proceedings of the Third International Workshop on Parallel Algorithms for Irregularly Structured Problems
A high-level approach to synthesis of high-performance codes for quantum chemistry

Proceedings of the 2002 ACM/IEEE conference on Supercomputing
Cilk: efficient multithreaded computing

Cilk: efficient multithreaded computing
Integrated Loop Optimizations for Data Locality Enhancement of Tensor Contraction Expressions

SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
A hypergraph partitioning based approach for scheduling of tasks with batch-shared I/O

CCGRID '05 Proceedings of the Fifth IEEE International Symposium on Cluster Computing and the Grid (CCGrid'05) - Volume 2
An extensible global address space framework with decoupled task and data abstractions

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing

Data exploration of turbulence simulations using a database cluster

Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Integrated Data and Task Management for Scientific Applications

ICCS '08 Proceedings of the 8th international conference on Computational Science, Part I
Scalable work stealing

Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Modeling Network Transition Constraints with Hypergraphs

Transportation Science
Modeling Network Transition Constraints with Hypergraphs

Transportation Science
On Two-Dimensional Sparse Matrix Partitioning: Models, Methods, and a Recipe

SIAM Journal on Scientific Computing
Fault oblivious eXascale whitepaper

Proceedings of the 1st International Workshop on Runtime and Operating Systems for Supercomputers
Inspector/executor load balancing algorithms for block-sparse tensor contractions

Proceedings of the 27th ACM International Conference on Supercomputing

Abstract

In this paper, we present a mechanism for automatic management of the memory hierarchy, including secondary storage, in the context of a global address space parallel programming framework. The programmer specifies the parallelism and locality in the computation. The scheduling of the computation into stages is then managed automatically, together with the movement of the associated data between secondary storage and global memory, and between global memory and local memory. A novel formulation of hypergraph partitioning is used to model the optimization problem of minimizing disk I/O. Experimental evaluation on a sub-computation from the quantum chemistry domain shows that, compared with alternative approaches used in state-of-the-art quantum chemistry codes, the proposed approach reduces disk I/O cost by up to a factor of 11 and turnaround time by up to 49%.
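
The abstract does not spell out the paper's hypergraph formulation, so the following is only a generic illustration of why hypergraphs suit this kind of problem, not the authors' model. It assumes tasks are vertices, each data block is a net connecting the tasks that use it, every block has the same size, and each stage reads every block it touches from disk exactly once with nothing kept resident between stages. The function name `stage_io_cost` and the task and block labels are hypothetical.

```python
from collections import defaultdict


def stage_io_cost(task_blocks, stage_of):
    """Count block loads for a given assignment of tasks to stages.

    task_blocks: dict mapping task name -> set of data blocks it accesses
    stage_of:    dict mapping task name -> stage index
    Returns (total_loads, extra_loads), where extra_loads is the
    "connectivity minus one" cost summed over all blocks (nets).
    Assumption: one load per (block, stage) pair, unit-size blocks.
    """
    stages_touching = defaultdict(set)
    for task, blocks in task_blocks.items():
        for block in blocks:
            stages_touching[block].add(stage_of[task])

    total_loads = sum(len(stages) for stages in stages_touching.values())
    extra_loads = sum(len(stages) - 1 for stages in stages_touching.values())
    return total_loads, extra_loads


# Four hypothetical tasks, each combining one A block with one B block.
task_blocks = {
    "t0": {"A0", "B0"},
    "t1": {"A0", "B1"},
    "t2": {"A1", "B0"},
    "t3": {"A1", "B1"},
}

# Grouping tasks that share an A block keeps each A block in one stage.
grouped = {"t0": 0, "t1": 0, "t2": 1, "t3": 1}
# A poor grouping splits every block across both stages.
scattered = {"t0": 0, "t1": 1, "t2": 1, "t3": 0}

print(stage_io_cost(task_blocks, grouped))    # (6, 2)
print(stage_io_cost(task_blocks, scattered))  # (8, 4)
```

Under these assumptions, a partitioner that minimizes the connectivity-minus-one cost subject to a per-stage memory limit would prefer the first grouping, since fewer blocks are re-read in a later stage.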