Text compression
New indices for text: PAT Trees and PAT arrays
Information retrieval
Suffix arrays: a new method for on-line string searches
SIAM Journal on Computing
An analysis of the Burrows-Wheeler transform
Journal of the ACM (JACM)
Succinct indexable dictionaries with applications to encoding k-ary trees and multisets
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
High-order entropy-compressed text indexes
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Optimal Algorithms for List Indexing and Subset Rank
WADS '89 Proceedings of the Workshop on Algorithms and Data Structures
Opportunistic data structures with applications
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Breaking a Time-and-Space Barrier in Constructing Full-Text Indices
FOCS '03 Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science
Compact representations of ordered sets
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
Tight bounds for the partial-sums problem
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
When indexing equals compression: experiments with compressing suffix arrays and applications
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
Compression boosting in optimal linear time using the Burrows-Wheeler Transform
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
Journal of the ACM (JACM)
Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching
SIAM Journal on Computing
Rank/select operations on large alphabets: a tool for text indexing
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
Squeezing succinct data structures into entropy bounds
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
Succinct suffix arrays based on run-length encoding
Nordic Journal of Computing
ACM Computing Surveys (CSUR)
Note: A simple storage scheme for strings achieving entropy bounds
Theoretical Computer Science
Compressed representations of sequences and full-text indexes
ACM Transactions on Algorithms (TALG)
Compressed indexes for dynamic text collections
ACM Transactions on Algorithms (TALG)
Rank and select revisited and extended
Theoretical Computer Science
Improved dynamic rank-select entropy-bound structures
LATIN'08 Proceedings of the 8th Latin American conference on Theoretical informatics
Statistical encoding of succinct data structures
CPM'06 Proceedings of the 17th Annual conference on Combinatorial Pattern Matching
Space-efficient construction of LZ-index
ISAAC'05 Proceedings of the 16th international conference on Algorithms and Computation
The myriad virtues of wavelet trees
ICALP'06 Proceedings of the 33rd international conference on Automata, Languages and Programming - Volume Part I
Dynamic rank-select structures with applications to run-length encoded texts
CPM'07 Proceedings of the 18th annual conference on Combinatorial Pattern Matching
An Improved Succinct Representation for Dynamic k-ary Trees
CPM '08 Proceedings of the 19th annual symposium on Combinatorial Pattern Matching
Compressed text indexes: From theory to practice
Journal of Experimental Algorithmics (JEA)
Engineering a compressed suffix tree implementation
Journal of Experimental Algorithmics (JEA)
A four-stage algorithm for updating a Burrows-Wheeler transform
Theoretical Computer Science
Dynamic rank/select structures with applications to run-length encoded texts
Theoretical Computer Science
Rank/select on dynamic compressed sequences and applications
Theoretical Computer Science
Compressed Suffix Arrays for Massive Data
SPIRE '09 Proceedings of the 16th International Symposium on String Processing and Information Retrieval
Directly Addressable Variable-Length Codes
SPIRE '09 Proceedings of the 16th International Symposium on String Processing and Information Retrieval
Dynamic extended suffix arrays
Journal of Discrete Algorithms
The compressed permuterm index
ACM Transactions on Algorithms (TALG)
Fully-functional succinct trees
SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
Approximate all-pairs suffix/prefix overlaps
CPM'10 Proceedings of the 21st annual conference on Combinatorial pattern matching
Compression, indexing, and retrieval for massive string data
CPM'10 Proceedings of the 21st annual conference on Combinatorial pattern matching
Practical approaches to reduce the space requirement of Lempel-Ziv-based compressed text indices
Journal of Experimental Algorithmics (JEA)
Succinct representations of dynamic strings
SPIRE'10 Proceedings of the 17th international conference on String processing and information retrieval
Space-efficient construction of Lempel-Ziv compressed text indexes
Information and Computation
Succinct nearest neighbor search
Proceedings of the Fourth International Conference on SImilarity Search and APplications
ACM Transactions on Algorithms (TALG)
Word-based self-indexes for natural language text
ACM Transactions on Information Systems (TOIS)
Compact rich-functional binary relation representations
LATIN'10 Proceedings of the 9th Latin American conference on Theoretical Informatics
Approximate all-pairs suffix/prefix overlaps
Information and Computation
Space-efficient data-analysis queries on grids
ISAAC'11 Proceedings of the 22nd international conference on Algorithms and Computation
The wavelet trie: maintaining an indexed sequence of strings in compressed space
PODS '12 Proceedings of the 31st symposium on Principles of Database Systems
CRAM: compressed random access memory
ICALP'12 Proceedings of the 39th international colloquium conference on Automata, Languages, and Programming - Volume Part I
Least random suffix/prefix matches in output-sensitive time
CPM'12 Proceedings of the 23rd Annual conference on Combinatorial Pattern Matching
Computing Lempel-Ziv factorization online
MFCS'12 Proceedings of the 37th international conference on Mathematical Foundations of Computer Science
Compressing IP forwarding tables for fun and profit
Proceedings of the 11th ACM Workshop on Hot Topics in Networks
SPIRE'12 Proceedings of the 19th international conference on String Processing and Information Retrieval
Space-efficient data-analysis queries on grids
Theoretical Computer Science
Compressing IP forwarding tables: towards entropy bounds and beyond
Proceedings of the ACM SIGCOMM 2013 conference on SIGCOMM
Space efficient data structures for dynamic orthogonal range counting
Computational Geometry: Theory and Applications
Compressed property suffix trees
Information and Computation
We give new solutions to the Searchable Partial Sums with Indels problem. Given a sequence of n k-bit numbers, we present a structure that takes kn + o(kn) bits of space and performs the operations sum, search, insert, and delete, all in O(log n) worst-case time, for any k = O(log n). This extends previous results by Hon et al. [2003c], which achieve the same space and O(log n/log log n) time for the queries, but whose bounds for insert and delete are amortized, worse than ours, and supported only for k = O(1). Our result matches an existing lower bound for large values of k.

We also give new solutions to the Dynamic Sequence problem. Given a sequence of n symbols in the range [1,σ] with binary zero-order entropy H0, we present a dynamic data structure that requires nH0 + o(n log σ) bits of space and supports rank and select, as well as inserting and deleting symbols at arbitrary positions, in O(log n log σ) time. Our result is the first entropy-bound dynamic data structure for rank and select over general sequences. In the case σ = 2, where both previous problems coincide, we improve the dynamic solution of Hon et al. [2003c] in that we compress the sequence. The only previous result with entropy-bound space for dynamic binary sequences is by Blandford and Blelloch [2004], which has the same complexities as our structure but does not achieve a constant of 1 multiplying the entropy term in the space complexity.

Finally, we present a new dynamic compressed full-text self-index for a collection of texts over an alphabet of size σ, of overall length n and hth order empirical entropy Hh. The index requires nHh + o(n log σ) bits of space, for any h ≤ α log_σ n and constant 0 < α < 1, and counts the occurrences of a pattern of length m in time O(m log n log σ). Each such occurrence can be reported in O(log^2 n log log n) time, and displaying a context of length ℓ from a text takes time O(log n(ℓ log σ + log n log log n)).
Insertion/deletion of a text to/from the collection takes O(log n log σ) time per symbol. This significantly improves the space of a previous result by Chan et al. [2004] in exchange for a slight time complexity penalty. We achieve at the same time the first dynamic index requiring essentially nHh bits of space, and the first construction of a compressed full-text self-index within that working space. Previous results achieve at best O(nHh) space with constants larger than 1 [Ferragina and Manzini 2000; Arroyuelo and Navarro 2005] and higher time complexities.

An important result we prove in this paper is that the wavelet tree of the Burrows-Wheeler transform of a text, if compressed with a technique that achieves zero-order compression locally (e.g., Raman et al. [2002]), automatically achieves hth order entropy space for any h. This unforeseen relation is essential for the dynamic index results above, and it also leads to significant simplifications of many existing static compressed full-text self-indexes that build on wavelet trees.
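
To make the operations named in the abstract concrete, the following is a minimal reference sketch of their semantics only. It is not the structure from the paper: every operation here runs in O(n) time on a plain list, with no compression. The class name `NaivePartialSums`, the 1-based indexing, and the convention that search(y) returns the smallest i with sum(i) >= y are assumptions made for illustration.

```python
from itertools import accumulate


class NaivePartialSums:
    """Reference semantics only: every operation costs O(n) time.

    Hypothetical helper for illustration, not the paper's structure.
    """

    def __init__(self, values=()):
        self.values = list(values)

    def sum(self, i):
        """Sum of the first i numbers (1-based; sum(0) == 0)."""
        return sum(self.values[:i])

    def search(self, y):
        """Smallest i with sum(i) >= y (assumed convention), else None."""
        for i, prefix in enumerate(accumulate(self.values), start=1):
            if prefix >= y:
                return i
        return None

    def insert(self, i, x):
        """Insert x so that it becomes the i-th number."""
        self.values.insert(i - 1, x)

    def delete(self, i):
        """Remove the i-th number."""
        del self.values[i - 1]


def rank(seq, c, i):
    """Number of occurrences of symbol c among the first i symbols."""
    return seq[:i].count(c)


def select(seq, c, j):
    """1-based position of the j-th occurrence of c, or None if absent."""
    seen = 0
    for pos, symbol in enumerate(seq, start=1):
        if symbol == c:
            seen += 1
            if seen == j:
                return pos
    return None
```

On a sequence of bits, `sum(i)` equals the rank of 1 up to position i and `search(j)` equals the select of the j-th 1, which is the sense in which the two problems coincide for σ = 2.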