A full-text index is a data structure built over a text string T[1,n]. Its most basic functionality is (a) counting how many times a pattern string P[1,m] appears in T and (b) locating all those occ positions. Several existing indexes solve (a) in O(m) time and (b) in O(occ) time. In this paper we propose two new queries: (c) counting how many times P[1,m] appears in T[l,r] and (d) locating all those occ_{l,r} positions. These can be solved using (a) and (b), but this requires O(occ) time even when occ_{l,r} is much smaller than occ. We present two solutions to (c) and (d). The first is an index that requires O(n log n) bits of space and answers (c) in O(m + log n) time and (d) in O(log n) time per occurrence (that is, O(occ_{l,r} log n) time overall). A variant of the first solution answers (c) in O(m + log log n) time and (d) in constant time per occurrence, but requires O(n log^{1+ε} n) bits of space for any constant ε > 0. The second solution requires O(nm log σ) bits of space, solving (c) in O(m⌈log σ / log log n⌉) time and (d) in O(m⌈log σ / log log n⌉) time per occurrence, where σ is the alphabet size. This second structure takes less space when the text is compressible. Our solutions can be seen as a generalization of rank and select dictionaries, which compute how many times a given character c appears in a prefix T[1,i] and locate the i-th occurrence of c in T. Our solution to (c) extends character rank queries to substring rank queries, and our solution to (d) extends character select queries to substring select queries. As a byproduct, we show how rank queries can be used to implement fractional cascading in little space, obtaining an alternative implementation of a well-known two-dimensional range search data structure by Chazelle. We also show that Grossi et al.'s wavelet trees are suitable for two-dimensional range searching, and we describe their connection with Chazelle's data structure.
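To make queries (c) and (d) concrete, the following is a minimal Python sketch in the spirit of the first solution, not the paper's actual structure. It assumes that the occurrences of P form a contiguous range of the suffix array, so a position-restricted query becomes a two-dimensional range query (suffix-array rank versus text position) answered by a wavelet tree built over the suffix array. All names (`suffix_array`, `WaveletTree`, `sa_range`, `count_restricted`, `locate_restricted`) are hypothetical, indices are 0-based rather than the 1-based T[1,n] used above, and plain Python lists stand in for succinct bitmaps with rank support, so the stated space and time bounds are not achieved.

```python
def suffix_array(text):
    # Naive construction by sorting suffixes; adequate for a small illustration.
    return sorted(range(len(text)), key=lambda i: text[i:])


class WaveletTree:
    """Pointer-based wavelet tree over integers in [lo, hi] (illustrative only)."""

    def __init__(self, seq, lo, hi):
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        if lo >= hi or not seq:
            return
        mid = (lo + hi) // 2
        # zeros[i] = how many of seq[:i] are routed to the left child (a rank query).
        self.zeros = [0]
        for v in seq:
            self.zeros.append(self.zeros[-1] + (v <= mid))
        self.left = WaveletTree([v for v in seq if v <= mid], lo, mid)
        self.right = WaveletTree([v for v in seq if v > mid], mid + 1, hi)

    def range_count(self, i, j, a, b):
        """Count values in [a, b] among sequence positions i..j-1."""
        if i >= j or b < self.lo or a > self.hi:
            return 0
        if a <= self.lo and self.hi <= b:
            return j - i
        zi, zj = self.zeros[i], self.zeros[j]
        return (self.left.range_count(zi, zj, a, b)
                + self.right.range_count(i - zi, j - zj, a, b))

    def range_report(self, i, j, a, b):
        """List values in [a, b] among sequence positions i..j-1, in increasing order."""
        if i >= j or b < self.lo or a > self.hi:
            return []
        if self.lo == self.hi:
            return [self.lo] * (j - i)
        zi, zj = self.zeros[i], self.zeros[j]
        return (self.left.range_report(zi, zj, a, b)
                + self.right.range_report(i - zi, j - zj, a, b))


def build_index(text):
    sa = suffix_array(text)
    return sa, WaveletTree(sa, 0, len(text) - 1)


def sa_range(text, sa, pattern):
    """Half-open suffix-array range [sp, ep) of suffixes that start with pattern."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    sp, hi = lo, len(sa)
    while lo < hi:  # first suffix whose m-prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sp, lo


def count_restricted(text, sa, wt, pattern, l, r):
    """Query (c): occurrences of pattern lying entirely inside text[l..r], inclusive."""
    last_start = r - len(pattern) + 1
    if last_start < l:
        return 0
    sp, ep = sa_range(text, sa, pattern)
    return wt.range_count(sp, ep, l, last_start)


def locate_restricted(text, sa, wt, pattern, l, r):
    """Query (d): starting positions of those occurrences, in increasing order."""
    last_start = r - len(pattern) + 1
    if last_start < l:
        return []
    sp, ep = sa_range(text, sa, pattern)
    return wt.range_report(sp, ep, l, last_start)


text = "abracadabra"
sa, wt = build_index(text)
print(count_restricted(text, sa, wt, "a", 2, 8))   # 3
print(locate_restricted(text, sa, wt, "a", 2, 8))  # [3, 5, 7]
```

In this sketch the pattern search costs O(m log n) string comparisons rather than the O(m) of the indexes mentioned above; the range-counting step visits O(log n) nodes of the tree, which is the part that corresponds to the O(log n) term in the first solution.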