We present novel work-optimal PRAM algorithms for Burrows-Wheeler (BW) compression and decompression of strings over a constant alphabet. For a string of length n, the depth of the compression algorithm is O(log^2 n), and the depth of the corresponding decompression algorithm is O(log n). These appear to be the first polylogarithmic-time, work-optimal parallel algorithms for any standard lossless compression scheme. The algorithms for the individual stages of compression and decompression may also be of independent interest: 1. a novel O(log n)-time, O(n)-work PRAM algorithm for Huffman decoding; 2. original insights into the stages of the BW compression and decompression problems, bringing out parallelism that was not readily apparent. We then map this parallelism in interesting ways onto elementary parallel routines that have O(log n)-time, O(n)-work solutions, such as: (i) prefix-sums problems with an appropriately defined associative binary operator for several stages, and (ii) list ranking for the final stage of decompression (the inverse block-sorting transform). Companion work reports empirical speedups of up to 25x for compression and up to 13x for decompression. This reflects a speedup of 70x over recent work on BW compression on GPUs.
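To make the list-ranking connection concrete, the following is a minimal sequential sketch of the inverse block-sorting transform (not the paper's parallel algorithm). Reconstructing the text amounts to following a permutation of positions (the LF-mapping) link by link; it is exactly this pointer chasing that the abstract's final stage recasts as list ranking, for which O(log n)-time, O(n)-work PRAM solutions exist. The function name and example string are illustrative, not from the source.

```python
def inverse_bwt(last_column, primary_index):
    """Sequential inverse BWT: chase the LF-mapping permutation.

    last_column:   the BWT output L (last column of the sorted
                   rotation matrix).
    primary_index: the row of the sorted matrix holding the
                   original string.
    """
    n = len(last_column)
    # Stable sort of positions by character yields, for each row i,
    # the row whose rotation follows it; stability of Python's sort
    # is essential for the mapping to be correct.
    lf = sorted(range(n), key=lambda i: last_column[i])
    out = []
    j = primary_index
    for _ in range(n):          # one pointer-chasing step per output symbol
        j = lf[j]
        out.append(last_column[j])
    return "".join(out)

# BWT of "banana$" is "annb$aa" with primary index 4.
print(inverse_bwt("annb$aa", 4))  # → banana$
```

The sequential loop is inherently a linked-list traversal of length n; the parallel algorithm referenced in the abstract replaces it with list ranking rather than iterating step by step.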