A unified compiler algorithm for optimizing locality, parallelism and communication in out-of-core computations

Authors:
M. Kandemir; A. Choudhary; J. Ramanujam; M. Kandaswamy
Affiliations:
Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY; Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL; Department of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA; Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY
Venue:
Proceedings of the fifth workshop on I/O in parallel and distributed systems
Year:
1997

Citing 24
Cited 5

Strategies for cache and local memory management by global program transformation

Journal of Parallel and Distributed Computing - Special Issue on Languages, Compilers and environments for Parallel Programming
A data locality optimizing algorithm

PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Compiling Fortran D for MIMD distributed-memory machines

Communications of the ACM
Non-unimodular transformations of nested loops

Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Global optimizations for parallelism and locality on scalable parallel machines

PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Access normalization: loop restructuring for NUMA computers

ACM Transactions on Computer Systems (TOCS)
The high performance Fortran handbook

The high performance Fortran handbook
Compiling for NUMA parallel machines

Compiling for NUMA parallel machines
A model and compilation strategy for out-of-core data parallel programs

PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Data and computation transformations for multiprocessors

PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Beyond unimodular transformations

The Journal of Supercomputing
Evaluating the impact of advanced memory systems on compiler-parallelized codes

PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Improving data locality with loop transformations

ACM Transactions on Programming Languages and Systems (TOPLAS)
Automatic optimization of communication in compiling out-of-core stencil codes

ICS '96 Proceedings of the 10th international conference on Supercomputing
Automatic compiler-inserted I/O prefetching for out-of-core applications

OSDI '96 Proceedings of the second USENIX symposium on Operating systems design and implementation
Automatic data layout for distributed memory machines

Automatic data layout for distributed memory machines
Organizing matrices and matrix operations for paged memory systems

Communications of the ACM
High Performance Compilers for Parallel Computing

High Performance Compilers for Parallel Computing
A Loop Transformation Theory and an Algorithm to Maximize Parallelism

IEEE Transactions on Parallel and Distributed Systems
Data Access Reorganizations in Compiling Out-of-Core Data Parallel Programs on Distributed Memory Machines

IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Characterizing parallel file-access patterns on a large-scale multiprocessor

IPPS '95 Proceedings of the 9th International Symposium on Parallel Processing
ViC*: A Preprocessor for Virtual-Memory C*

ViC*: A Preprocessor for Virtual-Memory C*
Unifying Data and Control Transformations for Distributed Shared Memory Machines

Unifying Data and Control Transformations for Distributed Shared Memory Machines
Techniques for compiling I/O intensive parallel programs

Techniques for compiling I/O intensive parallel programs

A Unified Framework for Optimizing Locality, Parallelism, and Communication in Out-of-Core Computations

IEEE Transactions on Parallel and Distributed Systems
Compiler-Directed Collective-I/O

IEEE Transactions on Parallel and Distributed Systems
An Experimental Evaluation of I/O Optimizations on Different Applications

IEEE Transactions on Parallel and Distributed Systems
An Experimental Evaluation of I/O Optimizations on Different Applications

IEEE Transactions on Parallel and Distributed Systems
A Compiler-Guided Approach for Reducing Disk Power Consumption by Exploiting Disk Access Locality

Proceedings of the International Symposium on Code Generation and Optimization
