The difficulty of handling out-of-core data limits the potential of parallel machines and high-end supercomputers. Because writing an efficient out-of-core version of a program is difficult, and because virtual memory systems do not perform well on scientific computations, we believe there is a clear need for a compiler-directed explicit I/O approach to out-of-core computations. In this paper, we present a compiler algorithm that optimizes the locality of disk accesses in out-of-core codes by choosing a good combination of file layouts on disk and loop transformations. The transformations change the order in which array data are accessed. Experimental results obtained on the IBM SP-2 and Intel Paragon provide encouraging evidence that our approach is successful at optimizing programs that depend on disk-resident data on distributed-memory machines.
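To illustrate why file layout and loop order need to be chosen together, the following is a minimal sketch and not the algorithm presented in the paper. It assumes a toy cost model in which a two-dimensional array lives in a file, only one disk block is buffered in memory at a time, and every switch to a different block costs one fetch. The function names, the layout labels `row_major` and `col_major`, and the loop-order labels `ij` and `ji` are all hypothetical and chosen only for this example.

```python
def file_offset(i, j, n_rows, n_cols, layout):
    """Element offset of A[i][j] in the file under the given layout (assumed model)."""
    if layout == "row_major":
        return i * n_cols + j
    if layout == "col_major":
        return j * n_rows + i
    raise ValueError(f"unknown layout: {layout}")


def count_block_fetches(n_rows, n_cols, layout, loop_order, block_elems):
    """Count block fetches for a full traversal, assuming a single-block buffer.

    loop_order "ij" means i is the outer loop and j the inner loop;
    "ji" is the interchanged nest.
    """
    if loop_order == "ij":
        index_pairs = ((i, j) for i in range(n_rows) for j in range(n_cols))
    elif loop_order == "ji":
        index_pairs = ((i, j) for j in range(n_cols) for i in range(n_rows))
    else:
        raise ValueError(f"unknown loop order: {loop_order}")

    fetches = 0
    current_block = None  # the one block currently held in memory
    for i, j in index_pairs:
        block = file_offset(i, j, n_rows, n_cols, layout) // block_elems
        if block != current_block:
            fetches += 1
            current_block = block
    return fetches


if __name__ == "__main__":
    # Compare every layout / loop-order combination for an 8x8 array
    # with 4 elements per disk block.
    for layout in ("row_major", "col_major"):
        for loop_order in ("ij", "ji"):
            n = count_block_fetches(8, 8, layout, loop_order, block_elems=4)
            print(layout, loop_order, n)
```

Under this toy model, an 8x8 array with 4 elements per block needs 16 fetches when the inner loop runs along the dimension that is contiguous in the file, and 64 fetches when it does not. Either interchanging the loops or changing the file layout restores the cheaper case, which is the kind of combined choice the abstract describes.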