Scientific simulations often produce large volumes of output that must be moved to another platform for visualization or storage. This long-distance migration is slow because the data are large and the network is slow. Compression can improve migration performance by reducing the data size, but compression is computation-intensive and so adds a cost of its own. In this work, we show how to reduce data migration cost by incorporating compression into migration. We analyze eight scientific data sets and propose three approaches for parallel compression of scientific data. Our results show that, with reasonably fast processors and typical parallel configurations, the compression cost for large scientific data is outweighed by the performance gain obtained by migrating less data. We found that a client-side compression approach (CC) can improve I/O and migration performance by an order of magnitude. In our experiments, CC always matches or outperforms migration without compression when we overlap migration with computation, even for dense floating-point data that compresses poorly. We also present a variant of CC that is well suited for use with implementations of two-phase I/O.
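The trade-off described above, compression time spent versus transfer time saved, can be illustrated with a back-of-the-envelope cost model. The sketch below is not the model or the measurements from this work. It assumes a single sender that compresses all of the data and then sends it, with no overlap and no parallelism, and all function names, parameters, and numbers are hypothetical.

```python
# Illustrative cost model only (an assumption, not taken from the paper):
# data is compressed in full, then sent, with no overlap between the two steps.

def migration_time(size_mb, bandwidth_mb_s):
    """Seconds to send size_mb of uncompressed data over the network."""
    return size_mb / bandwidth_mb_s


def compressed_migration_time(size_mb, bandwidth_mb_s, compress_mb_s, ratio):
    """Seconds to compress size_mb and then send the compressed result.

    ratio is original size divided by compressed size (2.0 means half size).
    """
    compress_time = size_mb / compress_mb_s
    transfer_time = (size_mb / ratio) / bandwidth_mb_s
    return compress_time + transfer_time


def break_even_compress_rate(bandwidth_mb_s, ratio):
    """Compression throughput (MB/s) above which compressing first is faster.

    Derived from size/c + size/(ratio*b) < size/b, i.e. c > b*ratio/(ratio-1).
    """
    if ratio <= 1.0:
        return float("inf")  # data that does not shrink can never pay off
    return bandwidth_mb_s * ratio / (ratio - 1.0)


if __name__ == "__main__":
    # Hypothetical numbers: 1000 MB of output, a 10 MB/s wide-area link,
    # a compressor running at 50 MB/s, and a 2:1 compression ratio.
    plain = migration_time(1000, 10)
    packed = compressed_migration_time(1000, 10, 50, 2.0)
    print(f"uncompressed: {plain:.1f} s, compress then send: {packed:.1f} s")
    print(f"break-even compressor speed: {break_even_compress_rate(10, 2.0):.1f} MB/s")
```

Under this simplified model, compression helps whenever the compressor is faster than the break-even rate, which is low when the network is slow and the data compresses well. Overlapping compression or migration with computation, as in the experiments summarized above, would only make the comparison more favorable to compression.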