Modifications of the Burrows and Wheeler Data Compression Algorithm

Authors:
Bernhard Balkenhol; Stefan Kurtz; Yuri M. Shtarkov
Venue:
DCC '99 Proceedings of the Conference on Data Compression
Year:
1999

Citing 0
Cited 20

Word-based block-sorting text compression
ACSC '01 Proceedings of the 24th Australasian conference on Computer science

Enhanced word-based block-sorting text compression
ACSC '02 Proceedings of the twenty-fifth Australasian conference on Computer science - Volume 4

Burrows-Wheeler compression with variable length integer codes
Software: Practice & Experience

Invited Lecture: The Burrows-Wheeler Transform: Theory and Practice
MFCS '99 Proceedings of the 24th International Symposium on Mathematical Foundations of Computer Science

Switching Between Two On-line List Update Algorithms for Higher Compression of Burrows-Wheeler Transformed Data
DCC '00 Proceedings of the Conference on Data Compression

Move-to-Front and Inversion Coding
DCC '00 Proceedings of the Conference on Data Compression

Parsing Strategies for BWT Compression
DCC '01 Proceedings of the Data Compression Conference

Can We Do without Ranks in Burrows Wheeler Transform Compression?
DCC '01 Proceedings of the Data Compression Conference

Prototyping of Efficient Hardware Algorithms for Data Compression in Future Communication Systems
RSP '01 Proceedings of the 12th International Workshop on Rapid System Prototyping

Compression boosting in optimal linear time using the Burrows-Wheeler Transform
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms

Word-based text compression using the Burrows-Wheeler transform
Information Processing and Management: an International Journal

Boosting textual compression in optimal linear time
Journal of the ACM (JACM)

Efficient Algorithms for the Inverse Sort Transform
IEEE Transactions on Computers

Faster suffix sorting
Theoretical Computer Science

An Application of Self-organizing Data Structures to Compression
SEA '09 Proceedings of the 8th International Symposium on Experimental Algorithms

Word-based text compression using the Burrows-Wheeler transform
Information Processing and Management: an International Journal

Move-to-Front, Distance Coding, and Inversion Frequencies revisited
Theoretical Computer Science

Post BWT stages of the Burrows–Wheeler compression algorithm
Software: Practice & Experience

Computing the inverse sort transform in linear time
ACM Transactions on Algorithms (TALG)

Suffix tree based data compression
SOFSEM'05 Proceedings of the 31st international conference on Theory and Practice of Computer Science

Abstract

In 1994 Burrows and Wheeler [3] described a universal data compression algorithm (BW-algorithm, for short) that achieved compression rates close to the best known compression rates. Due to its simplicity, the algorithm can be implemented with relatively low complexity. Fenwick [5] described ideas to improve the efficiency (i.e., the compression rate) and complexity of the BW-algorithm, and discussed the relationship of the algorithm to other compression methods. Schindler [12] proposed a Burrows and Wheeler Transformation (BWT, for short) that is based on a limited ordering. This speeds up compression, but slows down decompression and slightly decreases the efficiency. Larsson [8] described the relationship of the BWT to suffix trees and context trees. Sadakane [11] suggested a faster method to compute the BWT and compared it to other methods. Recently, Balkenhol and Kurtz [1] gave a thorough analysis of the BWT from an information-theoretic point of view. They described implementation techniques for data compression algorithms based on the BWT, and developed a program with a better compression rate.

In this paper we improve upon these previous results on the BW-algorithm. Based on the context tree model, we consider the specific statistical properties of the data at the output of the BWT. We describe six important properties, three of which have not been described elsewhere. These considerations lead to modifications of the coding method, which in turn improve the coding efficiency. We briefly describe how to compute the BWT with low complexity in time and space, using suffix trees in two different representations. Finally, we present experimental results on the compression rate and running time of our method, and compare them to previous achievements. More references on the methods described in this paper can be found in [1, 5].
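
For readers unfamiliar with the transform this record refers to, the following is a minimal sketch of the textbook BWT and its inverse. It is not the suffix-tree construction the paper describes: it sorts all rotations naively, which is quadratic or worse in practice, and is meant only to show what the "output of the BWT" looks like. The function names `bwt` and `inverse_bwt` and the use of a `"\0"` sentinel are assumptions made for this illustration.

```python
def bwt(text: str, sentinel: str = "\0") -> str:
    """Naive forward transform: sort every rotation of text + sentinel
    and return the last column of the sorted rotation matrix.
    (Illustrative only; the paper uses suffix trees instead.)"""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(last: str, sentinel: str = "\0") -> str:
    """Invert the transform using only the last column.

    A stable sort of the positions of `last` by character yields the
    first column; order[j] is the position in `last` of the character
    that appears at row j of the first column."""
    n = len(last)
    order = sorted(range(n), key=lambda i: last[i])  # sorted() is stable
    # Find the row that starts with the sentinel; the text follows it.
    row = next(j for j in range(n) if last[order[j]] == sentinel)
    out = []
    for _ in range(n - 1):
        row = order[row]              # move to the rotation shifted left by one
        out.append(last[order[row]])  # first character of that rotation
    return "".join(out)


if __name__ == "__main__":
    transformed = bwt("banana")
    print(repr(transformed))
    print(inverse_bwt(transformed))
```

Equal characters tend to cluster in the transformed string because rows with similar right contexts sort next to each other; this is the statistical structure that the later coding stages exploit.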