Colored range queries are a well-studied topic in computational geometry and database research that, in the past decade, have found exciting applications in information retrieval. In this paper, we give improved time and space bounds for three important one-dimensional colored range queries (colored range listing, colored range top-k queries and colored range counting) and, as a consequence, new bounds for various document retrieval problems on general collections of sequences. Colored range listing is the problem of preprocessing a sequence $S[1,n]$ of colors so that, later, given an interval $[i,i+\ell-1]$, we list the different colors in $S[i,i+\ell-1]$. Colored range top-k queries ask instead for the k most frequent colors in the interval. Colored range counting asks for the number of different colors in the interval. We first describe a framework including almost all recent results on colored range listing and document listing, which suggests new combinations of data structures for these problems. For example, we give the first compressed data structure (using $nH_k(S)+o(n\log\sigma)$ bits, for any $k=o(\log_\sigma n)$, where $H_k(S)$ is the k-th order empirical entropy of $S$ and $\sigma$ the number of different colors in $S$) that answers colored range listing queries in constant time per returned result. We also give an efficient data structure for document listing whose size is bounded in terms of the k-th order entropy of the library of documents. We then show how (approximate) colored top-k queries can be reduced to (approximate) range-mode queries on subsequences, yielding the first efficient data structure for this problem. Finally, we show how modified wavelet trees can support colored range counting using $nH_0(S)+O(n)+o(nH_0(S))$ bits, and answer queries in $O(\log\ell)$ time. As far as we know, this is the first data structure in which the query time depends only on $\ell$ and not on $n$. We also show how our data structure can be made dynamic.
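To make the three query types concrete, the following Python sketch gives a naive reference implementation of their semantics. It is not the compressed data structures of the paper: every query here scans the interval in time linear in its length, and nothing is compressed. The function names and the example sequence are hypothetical, chosen only for illustration, and positions are 0-based (the abstract uses 1-based positions). Listing and counting rely on a standard observation: a position j in the interval holds the first occurrence of its color within the interval exactly when the previous occurrence of that color lies before the start of the interval.

```python
from collections import Counter


def prev_occurrence(S):
    """Return prev, where prev[j] is the index of the previous
    occurrence of S[j], or -1 if S[j] has not appeared before."""
    last = {}
    prev = []
    for j, color in enumerate(S):
        prev.append(last.get(color, -1))
        last[color] = j
    return prev


def colored_range_listing(S, prev, i, length):
    """List each distinct color in S[i : i + length] exactly once,
    in order of first occurrence inside the interval."""
    return [S[j] for j in range(i, i + length) if prev[j] < i]


def colored_range_counting(prev, i, length):
    """Count the distinct colors in the interval [i, i + length - 1]."""
    return sum(1 for j in range(i, i + length) if prev[j] < i)


def colored_range_top_k(S, i, length, k):
    """Return up to k most frequent colors in S[i : i + length].
    Naive baseline: counts every color in the interval."""
    counts = Counter(S[i:i + length])
    return [color for color, _ in counts.most_common(k)]


# Hypothetical example sequence of colors (0-based positions).
S = list("abacabcbb")
prev = prev_occurrence(S)

print(colored_range_listing(S, prev, 2, 5))   # ['a', 'c', 'b']
print(colored_range_counting(prev, 2, 5))     # 3
print(colored_range_top_k(S, 4, 5, 1))        # ['b']
```

A baseline like this is mainly useful as a correctness oracle when testing a more elaborate index, since its output is easy to verify by hand on short sequences.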