The LZ-index is a compressed full-text self-index able to represent a text $T_{1 \ldots u}$, over an alphabet of size $\sigma = O(\textrm{polylog}(u))$ and with $k$-th order empirical entropy $H_k(T)$, using $4uH_k(T) + o(u \log \sigma)$ bits for any $k = o(\log_\sigma u)$. It can report all the $occ$ occurrences of a pattern $P_{1 \ldots m}$ in $T$ in $O(m^3 \log \sigma + (m + occ) \log u)$ worst-case time. Its main drawback is the factor 4 in its space complexity, which makes it larger than other state-of-the-art alternatives. In this paper we present two different approaches to reduce the space requirement of the LZ-index. In both cases we achieve $(2 + \epsilon)uH_k(T) + o(u \log \sigma)$ bits of space, for any constant $\epsilon > 0$, and we simultaneously improve the search time to $O(m^2 \log m + (m + occ) \log u)$. Both indexes support displaying any subtext of length $\ell$ in optimal $O(\ell / \log_\sigma u)$ time. In addition, we show how the space can be squeezed to $(1 + \epsilon)uH_k(T) + o(u \log \sigma)$ bits to obtain a structure with $O(m^2)$ average search time for $m \geqslant 2\log_\sigma{u}$.
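To make the space bounds concrete, the following is a minimal sketch that computes the k-th order empirical entropy of a string using its standard definition and evaluates only the leading terms of the three bounds quoted above. It is not part of the LZ-index itself. The function names `empirical_entropy` and `leading_terms` and the default value of `eps` are illustrative assumptions, and the lower-order $o(u \log \sigma)$ terms are ignored.

```python
import math
from collections import Counter, defaultdict


def h0(counts, n):
    """Zero-order empirical entropy (bits per symbol) of a multiset of n symbols."""
    if n == 0:
        return 0.0
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def empirical_entropy(text, k):
    """k-th order empirical entropy H_k(text), in bits per symbol.

    Standard definition: group every symbol by the k symbols that precede it,
    and sum |group| * H_0(group) over all contexts, divided by the text length.
    """
    u = len(text)
    if u == 0:
        return 0.0
    if k == 0:
        return h0(Counter(text), u)
    followers = defaultdict(Counter)
    for i in range(k, u):
        followers[text[i - k:i]][text[i]] += 1
    total = 0.0
    for ctx_counts in followers.values():
        n = sum(ctx_counts.values())
        total += n * h0(ctx_counts, n)
    return total / u


def leading_terms(text, k, eps=0.1):
    """Leading terms (in bits) of the three space bounds; eps is an assumed constant."""
    u = len(text)
    hk = empirical_entropy(text, k)
    return {
        "original": 4 * u * hk,          # 4 u H_k(T)
        "reduced": (2 + eps) * u * hk,   # (2 + eps) u H_k(T)
        "squeezed": (1 + eps) * u * hk,  # (1 + eps) u H_k(T)
    }


if __name__ == "__main__":
    sample = "abracadabra" * 50
    for k in (0, 1, 2):
        print(k, round(empirical_entropy(sample, k), 4), leading_terms(sample, k))
```

On a short sample the ignored lower-order terms would dominate in practice, so the numbers printed here only illustrate how the constant in front of $uH_k(T)$ changes between the three variants.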