I/O libraries such as PANDA and DRA use blocked layouts for efficient access to disk-resident multi-dimensional arrays, with the shape of the blocks chosen to match the expected access pattern of the array. Sometimes different applications, or different phases of the same application, have very different access patterns for an array. In such situations, an array's blocked layout representation must be transformed for efficient access. In this paper, we describe a new approach to solving the layout transformation problem and demonstrate its effectiveness in the context of the Disk Resident Arrays (DRA) library. The approach handles re-blocking and permutation of dimensions. Results are provided that demonstrate the performance benefit compared to currently available mechanisms.
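To make the terms "re-blocking" and "permutation of dimensions" concrete, the following is a minimal in-memory sketch for the two-dimensional case. It is not the algorithm described in the paper and does not use the DRA or PANDA APIs: the function names (`to_blocks`, `empty_blocks`, `transform_layout`, `from_blocks`) are hypothetical, blocks live in a Python dictionary rather than on disk, and elements are moved one at a time, which is exactly the naive strategy an out-of-core method would need to improve on.

```python
def to_blocks(matrix, block_shape):
    """Split a 2-D list of lists into {(bi, bj): block}; edge blocks may be smaller."""
    rows, cols = len(matrix), len(matrix[0])
    br, bc = block_shape
    return {
        (i0 // br, j0 // bc): [row[j0:j0 + bc] for row in matrix[i0:i0 + br]]
        for i0 in range(0, rows, br)
        for j0 in range(0, cols, bc)
    }


def empty_blocks(shape, block_shape):
    """Allocate the blocks of a blocked layout, filled with None."""
    rows, cols = shape
    br, bc = block_shape
    return {
        (i0 // br, j0 // bc): [[None] * min(bc, cols - j0)
                               for _ in range(min(br, rows - i0))]
        for i0 in range(0, rows, br)
        for j0 in range(0, cols, bc)
    }


def transform_layout(blocks, shape, old_block, new_block, transpose=False):
    """Naively copy every element from the old blocked layout into a new one.

    With transpose=True the two dimensions are swapped (the only non-trivial
    permutation in 2-D). Returns the new blocks and the new array shape.
    """
    rows, cols = shape
    obr, obc = old_block
    nbr, nbc = new_block
    new_shape = (cols, rows) if transpose else (rows, cols)
    out = empty_blocks(new_shape, new_block)
    for (bi, bj), block in blocks.items():
        for di, row in enumerate(block):
            for dj, value in enumerate(row):
                # Global coordinates of this element in the source array.
                i, j = bi * obr + di, bj * obc + dj
                if transpose:
                    i, j = j, i
                # Destination block index and offset within that block.
                out[(i // nbr, j // nbc)][i % nbr][j % nbc] = value
    return out, new_shape


def from_blocks(blocks, shape, block_shape):
    """Reassemble a dense 2-D list of lists from a blocked layout."""
    rows, cols = shape
    br, bc = block_shape
    matrix = [[None] * cols for _ in range(rows)]
    for (bi, bj), block in blocks.items():
        for di, row in enumerate(block):
            for dj, value in enumerate(row):
                matrix[bi * br + di][bj * bc + dj] = value
    return matrix
```

For example, a 4x6 array stored in 2x3 blocks can be converted to a transposed 6x4 array stored in 4x2 blocks with `transform_layout(to_blocks(m, (2, 3)), (4, 6), (2, 3), (4, 2), transpose=True)`. In this sketch every element is touched individually; on disk, that access pattern would translate into many small, scattered reads and writes, which is the cost a dedicated layout transformation aims to avoid.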