The deep connection between the Burrows-Wheeler transform (BWT) and the so-called rank and select data structures for symbol sequences is the basis of most successful approaches to compressed text indexing. The rank of a symbol at a given position is the number of times the symbol appears in the corresponding prefix of the sequence. Select is the inverse: it retrieves the positions of the symbol's occurrences. It has been shown that improvements to rank/select algorithms, in combination with the BWT, translate into improved compressed text indexes. This paper is devoted to alternative implementations and extensions of rank and select data structures. First, we show that gap encoding techniques yield constant-time rank and select queries in essentially the same space as the best current direct solution (and sometimes less). Second, we extend symbol rank and select to substring rank and select, giving several space/time trade-offs for the problem. One application of these queries is position-restricted substring searching, where one specifies the range of the text to which the search is restricted, and only occurrences residing in that range are reported. In addition, arbitrary occurrences are reported in text position order. Several byproducts of our results display connections with searchable partial sums, Chazelle's two-dimensional data structures, and Grossi et al.'s wavelet trees.
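To make the rank and select definitions concrete, here is a minimal Python sketch. It is an illustration only, not the paper's data structures: every query is a plain linear scan rather than a constant-time operation, and the function names, the 0-based positions, and the convention that `rank` counts occurrences in the prefix of length `i` are all assumptions made for this example. The last two functions give a rough idea of gap encoding, where a bitmap is stored as the distances between consecutive 1s and select becomes a prefix sum over those gaps.

```python
def rank(seq, symbol, i):
    """Occurrences of `symbol` in the prefix seq[0:i] (assumed convention)."""
    return sum(1 for c in seq[:i] if c == symbol)


def select(seq, symbol, j):
    """0-based position of the j-th occurrence of `symbol` (j >= 1), or -1."""
    count = 0
    for pos, c in enumerate(seq):
        if c == symbol:
            count += 1
            if count == j:
                return pos
    return -1


def gap_encode(bits):
    """Represent a bitmap by the gaps between consecutive 1s."""
    gaps, prev = [], -1
    for pos, b in enumerate(bits):
        if b:
            gaps.append(pos - prev)  # distance from the previous 1 (or from -1)
            prev = pos
    return gaps


def select1_from_gaps(gaps, j):
    """Position of the j-th 1 (j >= 1): sum of the first j gaps, minus one."""
    return sum(gaps[:j]) - 1


text = "abracadabra"
third_a = select(text, "a", 3)        # position of the third 'a'
count_a = rank(text, "a", third_a + 1)  # occurrences of 'a' up to and including it

bits = [0, 1, 0, 0, 1, 1]
gaps = gap_encode(bits)
```

In this sketch, `select` inverts `rank` in the sense that `rank(seq, c, select(seq, c, j) + 1) == j` whenever the j-th occurrence exists. The prefix sum in `select1_from_gaps` is a naive stand-in for the searchable partial sums mentioned above.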