CULZSS: LZSS Lossless Data Compression on CUDA
CLUSTER '11 Proceedings of the 2011 IEEE International Conference on Cluster Computing
In this paper, we present an algorithm and the design improvements needed to port the serial Lempel-Ziv-Storer-Szymanski (LZSS) lossless data compression algorithm to a parallel version suitable for general-purpose graphics processing units (GPGPUs), specifically NVIDIA's CUDA framework. The two main stages of the algorithm, substring matching and encoding, are studied in detail and adapted to fit the GPU architecture. We conducted a detailed analysis of our performance results and compared them to serial and parallel CPU implementations of the LZSS algorithm. We also benchmarked our algorithm against the well-known, widely used programs GZIP and ZLIB. We achieved up to 34x better throughput than the serial CPU implementation of the LZSS algorithm and up to 2.21x better than the parallelized version.
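The two stages the abstract names, substring matching over a sliding window and token encoding, can be sketched in serial C as below. This is an illustrative assumption, not the paper's CUDA implementation: the window size, match-length limits, and byte-level token layout are all hypothetical choices made for the sketch.

```c
/* Minimal serial LZSS-style encoder sketch (hypothetical parameters,
 * not the CULZSS code). Token stream: a flag byte per token, where
 * 0 is followed by one literal byte and 1 by an (offset, length) pair. */
#include <assert.h>
#include <stddef.h>

#define WINDOW    255  /* how far back a match may reach (assumed)     */
#define MIN_MATCH 3    /* shorter matches are cheaper left as literals */
#define MAX_MATCH 18   /* cap so length fits in one byte (assumed)     */

/* Compress n bytes of `in` into `out` (out must hold up to 2*n bytes);
 * returns the number of output bytes written. */
static size_t lzss_compress(const unsigned char *in, size_t n,
                            unsigned char *out)
{
    size_t i = 0, o = 0;
    while (i < n) {
        size_t best_len = 0, best_off = 0;
        size_t start = i > WINDOW ? i - WINDOW : 0;
        /* Stage 1, substring matching: scan the sliding window for
         * the longest match starting at position i. */
        for (size_t j = start; j < i; j++) {
            size_t len = 0;
            while (len < MAX_MATCH && i + len < n &&
                   in[j + len] == in[i + len])
                len++;
            if (len > best_len) { best_len = len; best_off = i - j; }
        }
        /* Stage 2, encoding: emit a back-reference if the match is
         * long enough to pay for itself, otherwise a literal. */
        if (best_len >= MIN_MATCH) {
            out[o++] = 1;
            out[o++] = (unsigned char)best_off;
            out[o++] = (unsigned char)best_len;
            i += best_len;
        } else {
            out[o++] = 0;
            out[o++] = in[i++];
        }
    }
    return o;
}
```

For example, the 12-byte input "abcabcabcabc" compresses to three literal tokens for "abc" followed by a single (offset 3, length 9) back-reference, 9 output bytes in all. The window scan is the expensive, data-parallel part, which is why the GPU port focuses on it.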