Communication-intensive parallel applications spend a significant portion of their total execution time exchanging data between processes, which leads to poor performance in many cases. In this paper, we investigate message compression in the context of large-scale parallel message-passing systems to reduce the communication time of individual messages and to improve the bandwidth of the overall system. We implement and evaluate the cMPI message-passing library, which compresses messages on the fly with overhead low enough to yield a net reduction in execution time. Our results on six large-scale benchmark applications show that their execution speed improves by up to 98% when message compression is enabled.
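The trade-off described above, where compression only helps if its overhead is smaller than the transfer time it saves, can be illustrated with a minimal sketch. This is not the cMPI implementation: `zlib` stands in for the library's own compressor, and the one-byte header flag, the function names, and the non-overlapped cost model are all assumptions made for illustration.

```python
import zlib

# Hypothetical 1-byte header flags (not part of any real wire format).
RAW, COMPRESSED = b"\x00", b"\x01"


def pack_message(payload: bytes, level: int = 1) -> bytes:
    """Compress the payload with a fast zlib level; fall back to the
    raw bytes whenever compression does not make the message smaller."""
    candidate = zlib.compress(payload, level)
    if len(candidate) < len(payload):
        return COMPRESSED + candidate
    return RAW + payload


def unpack_message(wire: bytes) -> bytes:
    """Undo pack_message on the receiving side."""
    flag, body = wire[:1], wire[1:]
    return zlib.decompress(body) if flag == COMPRESSED else body


def compression_pays_off(size_bytes: float, ratio: float, link_rate: float,
                         comp_rate: float, decomp_rate: float) -> bool:
    """Assumed cost model with no overlap of compute and transfer.

    All rates are in bytes per second; ratio is original size divided
    by compressed size. Returns True if compress + send + decompress
    is faster than sending the uncompressed message.
    """
    plain = size_bytes / link_rate
    packed = (size_bytes / comp_rate
              + (size_bytes / ratio) / link_rate
              + size_bytes / decomp_rate)
    return packed < plain
```

Under this simplified model, compression helps most when the link is slow relative to the compressor, which matches the abstract's emphasis on keeping the compression overhead low.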