Research on succinct data structures (data structures occupying space close to the information-theoretical lower bound, but achieving speed similar to their standard counterparts) has steadily increased in the last few years. However, many theoretical constructions providing asymptotically optimal bounds are unusable in practice because of the very large constants involved. The study of practical implementations of the basic building blocks of such data structures is thus fundamental to obtaining practical applications. In this paper we argue that 64-bit and wider architectures are particularly suited to very efficient implementations of rank (counting the number of ones up to a given position) and select (finding the position of the i-th bit set), two essential building blocks of all succinct data structures. In contrast to typical 32-bit approaches, which rely on precomputed tables, we make pervasive use of broadword (a.k.a. SWAR, "SIMD in A Register") programming, which compensates for the constant burden associated with succinct structures by solving problems in parallel in a register. We provide an implementation named rank9 that addresses 2^64 bits, consumes less space and is significantly faster than current state-of-the-art 32-bit implementations, and a companion select9 structure that selects in nearly constant time using only access to aligned data. For sparsely populated arrays, we provide a simple broadword implementation of the Elias-Fano representation of monotone sequences. In doing so, we develop broadword algorithms for performing selection in a word or in a sequence of words that are of independent interest.
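To make the two operations concrete, the following is a minimal Python sketch of rank and select restricted to a single 64-bit word. It is not the rank9 or select9 structure described above. It only illustrates the table-free, in-register style of counting (classic sideways addition) that broadword programming builds on. The function names, the convention that rank counts ones strictly before a position, and the convention that select counts from zero are all assumptions made for this illustration. The final byte search uses plain loops for readability, where a true broadword routine would avoid them.

```python
MASK64 = (1 << 64) - 1
L8 = 0x0101010101010101  # lowest bit of every byte set


def byte_counts(x):
    """Return a word whose i-th byte holds the number of ones in byte i of x."""
    x &= MASK64
    x = x - ((x >> 1) & 0x5555555555555555)                          # 2-bit sums
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)   # 4-bit sums
    return (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F                       # 8-bit sums


def popcount64(x):
    """Sideways addition: count the set bits of a 64-bit word without tables."""
    return ((byte_counts(x) * L8) & MASK64) >> 56


def rank_in_word(x, pos):
    """Number of ones in bit positions [0, pos) of x, for 0 <= pos <= 64."""
    return popcount64(x & ((1 << pos) - 1))


def select_in_word(x, k):
    """Position of the k-th set bit of x (k counted from 0).

    Assumes k < popcount64(x).
    """
    # Byte i of `cumulative` holds the number of ones in bytes 0..i of x.
    cumulative = (byte_counts(x) * L8) & MASK64
    # Locate the byte containing the k-th one (plain loop, for clarity only).
    b = 0
    while ((cumulative >> (8 * b)) & 0xFF) <= k:
        b += 1
    if b > 0:
        k -= (cumulative >> (8 * (b - 1))) & 0xFF
    # Scan the selected byte bit by bit.
    byte = (x >> (8 * b)) & 0xFF
    for i in range(8):
        if (byte >> i) & 1:
            if k == 0:
                return 8 * b + i
            k -= 1
    raise ValueError("k must be smaller than the number of set bits")
```

With these conventions the two functions are inverses on set bits: for any word `x` and any valid `k`, `rank_in_word(x, select_in_word(x, k))` equals `k`.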