Block-sorting is an innovative compression mechanism introduced in 1994 by Burrows and Wheeler. It involves three steps: permuting the input one block at a time using the Burrows-Wheeler transform (BWT); applying a move-to-front (MTF) transform to each of the permuted blocks; and then entropy coding the output with a Huffman or arithmetic coder. Until now, block-sorting implementations have assumed that the input message is a sequence of characters. In this paper we extend the block-sorting mechanism to word-based models. We also consider other recency transformations, and show improved compression results compared to MTF and uniform arithmetic coding. For large files of text, the combination of word-based modeling, BWT, and MTF-like transformations allows excellent compression effectiveness to be attained at reasonable resource cost.
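As a rough illustration of the first two steps applied to words rather than characters, the following Python sketch parses text into tokens, permutes the token sequence with a naive BWT, and replaces each token with its move-to-front rank. It is a minimal sketch under stated assumptions, not the method evaluated in the paper: the regular-expression parsing rule, the rotation-sorting BWT, and all function names are hypothetical choices made for this example, and the final entropy-coding step is omitted.

```python
import re


def tokenize(text):
    # Assumed parsing rule: alternating runs of word and non-word characters.
    # Concatenating the tokens restores the original text exactly.
    return re.findall(r"\w+|\W+", text)


def bwt(symbols):
    # Naive BWT over a non-empty list of symbols: sort all rotations and
    # keep the last column. Quadratic space and time; illustration only.
    n = len(symbols)
    order = sorted(range(n), key=lambda i: symbols[i:] + symbols[:i])
    last = [symbols[(i - 1) % n] for i in order]
    return last, order.index(0)  # index of the row holding the original


def inverse_bwt(last, primary):
    # A stable sort of the last column gives the mapping used to walk
    # the original sequence back out, one symbol per step.
    order = sorted(range(len(last)), key=lambda i: last[i])
    out, j = [], primary
    for _ in last:
        j = order[j]
        out.append(last[j])
    return out


def mtf(symbols, alphabet):
    # Emit each symbol's current position, then move it to the front.
    table = list(alphabet)
    ranks = []
    for s in symbols:
        r = table.index(s)
        ranks.append(r)
        table.insert(0, table.pop(r))
    return ranks


def inverse_mtf(ranks, alphabet):
    table = list(alphabet)
    out = []
    for r in ranks:
        s = table.pop(r)
        out.append(s)
        table.insert(0, s)
    return out


text = "the cat and the dog and the bird"
tokens = tokenize(text)
alphabet = sorted(set(tokens))  # the decoder would need this word list too
last, primary = bwt(tokens)
ranks = mtf(last, alphabet)

# Both transforms are reversible, so the text can be recovered exactly.
restored = inverse_bwt(inverse_mtf(ranks, alphabet), primary)
print(ranks, "".join(restored) == text)
```

In a complete compressor the rank sequence, the primary index, and the word list would then be passed to a Huffman or arithmetic coder. The linear `table.index` lookup is acceptable for a character alphabet but becomes slow with a vocabulary of many thousands of words, so a practical word-based version would likely need a faster structure for the recency list.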