Automatic Realignment of Data Structures to Improve MPI Performance
ICN '10 Proceedings of the 2010 Ninth International Conference on Networks
Message Passing Interface (MPI) messages are centered on transmitting instances of MPI data types, which are usually modeled after the data types native to the application. When a user does not want to transmit a particular field of the native type, the corresponding MPI data type is often constructed with a gap at the displacement where the omitted field would be. Because the resulting MPI data type is non-contiguous, cycles are spent packing the data into a contiguous buffer for transmission and unpacking it on the receiving side. We show that by performing data type fission (segregating the transmitted fields from the non-transmitted fields) and aligning the MPI data type accordingly, we can completely eliminate the copying done during packing and unpacking, which can significantly improve the performance of communication-heavy Single Program Multiple Data (SPMD) applications.