In this paper, we study different approaches for rank and select on sequences of bytes and propose new implementation strategies. An extensive experimental evaluation comparing the efficiency of the different alternatives is provided. Given a sequence of bits, a rank query counts the number of occurrences of the bit 1 up to a given position, and a select query returns the position of the i-th occurrence of the bit 1. These operations are widely used in information retrieval and management, forming the basis of several data structures and algorithms for text collections, graphs, etc. There exist solutions that compute these operations on sequences of bits in constant time using additional information. However, new applications require rank and select to be computed on sequences of bytes instead of bits, and the solutions for the binary case are not directly applicable to sequences of bytes. The existing solutions for the byte case vary in their space-time trade-offs, which can still be improved.
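
To make the two queries concrete for the byte case, the following is a minimal Python sketch. It is not one of the strategies proposed in the paper. It assumes a simple scheme in which cumulative per-byte counters are stored every `block` positions: rank reads one counter and scans at most one block, and select binary-searches the counters before scanning. The class name `ByteRankSelect`, the `block` parameter, and the conventions used (0-based positions, inclusive rank, 1-based occurrence index) are hypothetical choices made for this illustration.

```python
class ByteRankSelect:
    """Illustrative rank/select over bytes with sampled cumulative counters."""

    def __init__(self, data: bytes, block: int = 4):
        self.data = data
        self.block = block
        # counts[k][b] = number of occurrences of byte b in data[0 : k * block]
        self.counts = [[0] * 256]
        running = [0] * 256
        for i, b in enumerate(data, start=1):
            running[b] += 1
            if i % block == 0:
                self.counts.append(running.copy())

    def rank(self, b: int, i: int) -> int:
        """Occurrences of byte b in data[0..i], with i 0-based and inclusive."""
        k = (i + 1) // self.block          # last sampled block fully covered
        start = k * self.block
        tail = sum(1 for x in self.data[start:i + 1] if x == b)
        return self.counts[k][b] + tail

    def select(self, b: int, j: int) -> int:
        """0-based position of the j-th occurrence (j >= 1) of b, or -1."""
        lo, hi = 0, len(self.counts) - 1
        # Binary search for the largest k with counts[k][b] < j.
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if self.counts[mid][b] < j:
                lo = mid
            else:
                hi = mid - 1
        seen = self.counts[lo][b]
        # Scan forward from the start of that block.
        for pos in range(lo * self.block, len(self.data)):
            if self.data[pos] == b:
                seen += 1
                if seen == j:
                    return pos
        return -1


rs = ByteRankSelect(b"abracadabra", block=4)
print(rs.rank(ord("a"), 6))    # occurrences of 'a' in positions 0..6
print(rs.select(ord("a"), 4))  # position of the 4th 'a'
```

Storing a full 256-entry counter array per sample makes the space cost of this sketch grow quickly as `block` shrinks, while a larger `block` lengthens the scans. This is one simple instance of the space-time trade-off that the abstract refers to.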