We report on a new experimental analysis of high-order entropy-compressed suffix arrays that retains the theoretical performance of previous work and improves on it in practice. Our experiments indicate that the resulting text index offers state-of-the-art compression: it requires roughly 20% of the original text size and does not need a separate copy of the text. We can additionally use a simple notion to encode and decode block-sorting transforms (such as the Burrows-Wheeler transform), achieving a compression ratio comparable to that of bzip2. We also provide a compressed representation of suffix trees (and their associated text) in a total space comparable to that of the text alone compressed with gzip.
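As background for the block-sorting transform mentioned above, here is a minimal sketch of the textbook Burrows-Wheeler transform and its inverse. It is not the encoding scheme of this work, which the abstract does not describe. The function names `bwt_encode` and `bwt_decode`, the NUL sentinel, and the naive suffix sort are assumptions made for illustration only; a practical implementation would build the suffix array with a dedicated construction algorithm.

```python
def bwt_encode(text: str, sentinel: str = "\0") -> str:
    """Textbook BWT: last column of the sorted suffixes of text + sentinel.

    The sentinel (an assumption of this sketch) must not occur in the text
    and must compare smaller than every text character.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    # Naive suffix array: sort suffix start positions by the suffix itself.
    # This is quadratic or worse and only meant for small inputs.
    suffix_array = sorted(range(len(s)), key=lambda i: s[i:])
    # The BWT character for a suffix is the character preceding it
    # (s[-1], the sentinel, for the suffix starting at position 0).
    return "".join(s[i - 1] for i in suffix_array)


def bwt_decode(last: str) -> str:
    """Invert the transform with the last-to-first (LF) mapping."""
    n = len(last)
    # A stable sort of positions by character yields the first column;
    # order[j] is the position in `last` of the j-th first-column character.
    order = sorted(range(n), key=lambda i: last[i])
    lf = [0] * n
    for first_pos, last_pos in enumerate(order):
        lf[last_pos] = first_pos
    # Row 0 is the suffix consisting of the sentinel alone, so last[0] is
    # the final text character; following LF walks the text backwards.
    out = []
    row = 0
    for _ in range(n - 1):
        out.append(last[row])
        row = lf[row]
    return "".join(reversed(out))
```

For example, `bwt_encode("banana")` groups equal characters into runs (`"annb\0aa"`), which is what makes the transformed string easier to compress, and `bwt_decode` recovers `"banana"` from it.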