Compact pat trees
Succinct indexable dictionaries with applications to encoding k-ary trees and multisets
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
High-order entropy-compressed text indexes
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Succinct Dynamic Data Structures
WADS '01 Proceedings of the 7th International Workshop on Algorithms and Data Structures
Journal of the ACM (JACM)
Rank/select operations on large alphabets: a tool for text indexing
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
ACM Computing Surveys (CSUR)
Compressed representations of sequences and full-text indexes
ACM Transactions on Algorithms (TALG)
Succinct indexes for strings, binary relations and multi-labeled trees
SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Succinct representations of permutations
ICALP'03 Proceedings of the 30th international conference on Automata, languages and programming
A fast and compact web graph representation
SPIRE'07 Proceedings of the 14th international conference on String processing and information retrieval
Directly Addressable Variable-Length Codes
SPIRE '09 Proceedings of the 16th International Symposium on String Processing and Information Retrieval
A New Point Access Method Based on Wavelet Trees
ER '09 Proceedings of the ER 2009 Workshops (CoMoL, ETheCoM, FP-UML, MOST-ONISW, QoIS, RIGiM, SeCoGIS) on Advances in Conceptual Modeling - Challenging Perspectives
Fast and Compact Web Graph Representations
ACM Transactions on the Web (TWEB)
Efficient set intersection for inverted indexing
ACM Transactions on Information Systems (TOIS)
Data structures: time, I/Os, entropy, joules!
ESA'10 Proceedings of the 18th annual European conference on Algorithms: Part II
Compressed self-indices supporting conjunctive queries on document collections
SPIRE'10 Proceedings of the 17th international conference on String processing and information retrieval
Practical compressed document retrieval
SEA'11 Proceedings of the 10th international conference on Experimental algorithms
Fixed block compression boosting in FM-indexes
SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
Space efficient wavelet tree construction
SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
Practical representations for web and social graphs
Proceedings of the 20th ACM international conference on Information and knowledge management
Word-based self-indexes for natural language text
ACM Transactions on Information Systems (TOIS)
Practical compressed suffix trees
SEA'10 Proceedings of the 9th international conference on Experimental Algorithms
Extended compact web graph representations
Algorithms and Applications
The wavelet trie: maintaining an indexed sequence of strings in compressed space
PODS '12 Proceedings of the 31st symposium on Principles of Database Systems
Distributed search based on self-indexed compressed text
Information Processing and Management: an International Journal
To index or not to index: time-space trade-offs in search engines with positional ranking functions
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Fast, small, simple rank/select on bitmaps
SEA'12 Proceedings of the 11th international conference on Experimental Algorithms
CPM'12 Proceedings of the 23rd Annual conference on Combinatorial Pattern Matching
SPIRE'12 Proceedings of the 19th international conference on String Processing and Information Retrieval
Compressed representation of web and social networks via dense subgraphs
SPIRE'12 Proceedings of the 19th international conference on String Processing and Information Retrieval
Dual-Sorted inverted lists in practice
SPIRE'12 Proceedings of the 19th international conference on String Processing and Information Retrieval
Smaller self-indexes for natural language
SPIRE'12 Proceedings of the 19th international conference on String Processing and Information Retrieval
Implicit indexing of natural language text by reorganizing bytecodes
Information Retrieval
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Journal of Discrete Algorithms
We present a practical study on the compact representation of sequences supporting rank, select, and access queries. While there are several theoretical solutions to the problem, only a few have been tried out, and little is known about how the others would perform, especially in the case of sequences with very large alphabets. We first present a new practical implementation of the compressed representation for bit sequences proposed by Raman, Raman, and Rao [SODA 2002], which is competitive with the existing ones when the sequences are not too compressible. It also has nice local compression properties, and we show that this makes it an excellent tool for compressed text indexing in combination with the Burrows-Wheeler transform. This shows the practicality of a recent theoretical proposal [Mäkinen and Navarro, SPIRE 2007], achieving spaces never seen before. Second, for general sequences, we tune wavelet trees for the case of very large alphabets by removing their pointer information. We show that this gives an excellent solution for representing a sequence within zero-order entropy space, in cases where the large alphabet poses a serious challenge to typical encoding methods. We also present the first implementation of Golynski et al.'s representation [SODA 2006], which offers another interesting time/space trade-off.
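To make the three queries and the idea of a wavelet tree without pointers concrete, here is a minimal Python sketch. It is not the implementation studied in the paper: the class name `LevelwiseWaveletTree` and its method names are hypothetical, symbols are assumed to be integers in `[0, sigma)`, and each level stores plain prefix counts instead of a compressed bitmap such as the Raman, Raman, and Rao representation. What it illustrates is that node boundaries can be recomputed from rank on the per-level bitmaps during the descent, so no child pointers need to be stored. For brevity, `select` is answered by binary search over `rank` rather than by climbing the tree with bitmap select.

```python
class LevelwiseWaveletTree:
    """Illustrative pointerless wavelet tree over integers in [0, sigma)."""

    def __init__(self, seq, sigma):
        self.n = len(seq)
        self.height = max(1, (sigma - 1).bit_length())
        self.levels = []  # levels[l][i] = number of 1 bits among the first i bits of level l
        cur = list(seq)
        for lvl in range(self.height):
            shift = self.height - 1 - lvl
            prefix = [0]
            for c in cur:
                prefix.append(prefix[-1] + ((c >> shift) & 1))
            self.levels.append(prefix)
            # A stable sort by the top (lvl + 1) bits keeps every node contiguous
            # on the next level, with the 0-child placed before the 1-child.
            cur.sort(key=lambda c: c >> shift)

    def access(self, i):
        """Return the symbol at position i (0-based)."""
        lo, hi, pos, sym = 0, self.n, i, 0
        for prefix in self.levels:
            bit = prefix[pos + 1] - prefix[pos]
            zeros = (hi - lo) - (prefix[hi] - prefix[lo])  # size of the 0-child
            ones_before = prefix[pos] - prefix[lo]
            if bit == 0:
                pos, hi = lo + (pos - lo) - ones_before, lo + zeros
            else:
                pos, lo = lo + zeros + ones_before, lo + zeros
            sym = (sym << 1) | bit
        return sym

    def rank(self, c, i):
        """Return the number of occurrences of c in the first i positions."""
        lo, hi, cnt = 0, self.n, i
        for lvl, prefix in enumerate(self.levels):
            bit = (c >> (self.height - 1 - lvl)) & 1
            zeros = (hi - lo) - (prefix[hi] - prefix[lo])
            ones = prefix[lo + cnt] - prefix[lo]
            if bit == 0:
                cnt, hi = cnt - ones, lo + zeros
            else:
                cnt, lo = ones, lo + zeros
        return cnt

    def select(self, c, j):
        """Return the 0-based position of the j-th occurrence of c (j >= 1)."""
        if j < 1 or j > self.rank(c, self.n):
            raise ValueError("fewer than j occurrences of c")
        lo, hi = 0, self.n - 1
        while lo < hi:  # simplification: binary search over rank
            mid = (lo + hi) // 2
            if self.rank(c, mid + 1) >= j:
                hi = mid
            else:
                lo = mid + 1
        return lo


if __name__ == "__main__":
    seq = [3, 0, 7, 3, 1, 3, 0, 5]
    wt = LevelwiseWaveletTree(seq, sigma=8)
    print(wt.access(2), wt.rank(3, 6), wt.select(3, 2))  # 7 3 3
```

In this sketch each query touches one bitmap per level, so it performs a number of bitmap rank operations proportional to the number of bits needed per symbol; the space of a real implementation depends on how the level bitmaps are represented, which is where compressed bitmaps would be substituted for the prefix arrays.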