Suffix arrays: a new method for on-line string searches
SIAM Journal on Computing
An analysis of the Burrows—Wheeler transform
Journal of the ACM (JACM)
Succinct indexable dictionaries with applications to encoding k-ary trees and multisets
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
High-order entropy-compressed text indexes
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Opportunistic data structures with applications
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Breaking a Time-and-Space Barrier in Constructing Full-Text Indices
FOCS '03 Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science
When indexing equals compression: experiments with compressing suffix arrays and applications
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
Compression boosting in optimal linear time using the Burrows-Wheeler Transform
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
Journal of the ACM (JACM)
Large alphabets and incompressibility
Information Processing Letters
Succinct suffix arrays based on run-length encoding
Nordic Journal of Computing
Compressed representations of sequences and full-text indexes
ACM Transactions on Algorithms (TALG)
Space-efficient static trees and graphs
SFCS '89 Proceedings of the 30th Annual Symposium on Foundations of Computer Science
Space-efficient construction of LZ-index
ISAAC'05 Proceedings of the 16th international conference on Algorithms and Computation
CPM'05 Proceedings of the 16th annual conference on Combinatorial Pattern Matching
The myriad virtues of wavelet trees
ICALP'06 Proceedings of the 33rd international conference on Automata, Languages and Programming - Volume Part I
The myriad virtues of Wavelet Trees
Information and Computation
Move-to-Front, Distance Coding, and Inversion Frequencies revisited
Theoretical Computer Science
Compression, indexing, and retrieval for massive string data
CPM'10 Proceedings of the 21st annual conference on Combinatorial pattern matching
Data structures: time, I/Os, entropy, joules!
ESA'10 Proceedings of the 18th annual European conference on Algorithms: Part II
Medium-space algorithms for inverse BWT
ESA'10 Proceedings of the 18th annual European conference on Algorithms: Part I
Compressed string dictionaries
SEA'11 Proceedings of the 10th international conference on Experimental algorithms
Practical compressed document retrieval
SEA'11 Proceedings of the 10th international conference on Experimental algorithms
Counting colours in compressed strings
CPM'11 Proceedings of the 22nd annual conference on Combinatorial pattern matching
LRM-trees: compressed indices, adaptive sorting, and compressed permutations
CPM'11 Proceedings of the 22nd annual conference on Combinatorial pattern matching
Fixed block compression boosting in FM-indexes
SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
New algorithms on wavelet trees and applications to information retrieval
Theoretical Computer Science
Fast relative lempel-ziv self-index for similar sequences
FAW-AAIM'12 Proceedings of the 6th international Frontiers in Algorithmics, and Proceedings of the 8th international conference on Algorithmic Aspects in Information and Management
Efficient in-memory top-k document retrieval
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
CPM'12 Proceedings of the 23rd Annual conference on Combinatorial Pattern Matching
LRM-Trees: Compressed indices, adaptive sorting, and compressed permutations
Theoretical Computer Science
Compressed data structures with relevance
Proceedings of the 21st ACM international conference on Information and knowledge management
SPIRE'12 Proceedings of the 19th international conference on String Processing and Information Retrieval
Smaller self-indexes for natural language
SPIRE'12 Proceedings of the 19th international conference on String Processing and Information Retrieval
Journal of Discrete Algorithms
Compressed indexes for text with wildcards
Theoretical Computer Science
Colored range queries and document retrieval
Theoretical Computer Science
Hi-index | 0.00 |
Compression boosting (Ferragina & Manzini, SODA 2004) is a new technique to enhance zeroth order entropy compressors' performance to k-th order entropy. It works by constructing the Burrows-Wheeler transform of the input text, finding optimal partitioning of the transform, and then compressing each piece using an arbitrary zeroth order compressor. The optimal partitioning has the property that the achieved compression is boosted to k-th order entropy, for any k. The technique has an application to text indexing: Essentially, building a wavelet tree (Grossi et al., SODA 2003) for each piece in the partitioning yields a k-th order compressed full-text self-index providing efficient substring searches on the indexed text (Ferragina et al., SPIRE 2004). In this paper, we show that using explicit compression boosting with wavelet trees is not necessary; our new analysis reveals that the size of the wavelet tree built for the complete Burrows-Wheeler transformed text is, in essence, the sum of those built for the pieces in the optimal partitioning. Hence, the technique provides a way to do compression boosting implicitly, with a trivial linear time algorithm, but fixed to a specific zeroth order compressor (Raman et al., SODA 2002). In addition to having these consequences on compression and static full-text self-indexes, the analysis shows that a recent dynamic zeroth order compressed self-index (Mäkinen & Navarro, CPM 2006) occupies in fact space proportional to k-th order entropy.