Modern large-scale applications generate staggering amounts of data. To summarize and index these data sets, databases often use bitmap indices. These indices have been widely adopted for two reasons: (1) queries can be processed with fast bit-wise operations, and (2) the bitmaps compress well. Today, two pervasive bitmap compression schemes employ a variation of run-length encoding, aligned over bytes (BBC) and words (WAH), respectively. BBC typically offers higher compression ratios, while WAH can achieve faster query processing, often at the cost of space. Recent work has further shown that reordering the rows of a bitmap can dramatically increase compression. However, these sorted bitmaps often display patterns of changing run lengths that are optimal for neither byte nor word alignment. We present a general framework to facilitate a variable-length compression scheme. Given a bitmap, our algorithm can use a different encoding length for each column. We further present an algorithm that efficiently processes queries when the encoding lengths share a common integer factor. Our empirical study shows that, in the best case, our approach out-compresses BBC by 30% and WAH by 70% on real data sets. Furthermore, we report a query processing speedup of 1.6× over BBC and 1.25× over WAH. We also show that these numbers improve drastically on our synthetic, uncorrelated data sets.
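To make the idea of a per-column encoding length concrete, the following is a minimal Python sketch of a WAH-style run-length encoder whose segment length is a parameter. It is an illustration under simplifying assumptions, not the algorithm evaluated in the paper: the names `encode`, `decode`, `encoded_bits`, and `pick_seg_len` are hypothetical, words are modelled as tuples rather than packed bits, and each word is assumed to cost one flag bit plus `seg_len` payload bits.

```python
def encode(bits, seg_len):
    """Encode a bit list into ('fill', bit, run) and ('lit', segment) words.

    Assumed word layout (illustrative): 1 flag bit + seg_len payload bits.
    A fill word spends one payload bit on the fill value, so its run
    counter is capped at 2**(seg_len - 1) - 1 segments.
    """
    max_run = 2 ** (seg_len - 1) - 1
    # Pad with zeros so the length is a multiple of seg_len.
    padded = bits + [0] * (-len(bits) % seg_len)
    words = []
    for i in range(0, len(padded), seg_len):
        seg = tuple(padded[i:i + seg_len])
        if all(b == seg[0] for b in seg):
            last = words[-1] if words else None
            # Homogeneous segment: extend the previous fill when possible.
            if last and last[0] == "fill" and last[1] == seg[0] and last[2] < max_run:
                words[-1] = ("fill", seg[0], last[2] + 1)
            else:
                words.append(("fill", seg[0], 1))
        else:
            # Mixed segment: store it verbatim as a literal word.
            words.append(("lit", seg))
    return words


def decode(words, seg_len, n_bits):
    """Invert encode(); n_bits trims the zero padding."""
    bits = []
    for word in words:
        if word[0] == "fill":
            bits.extend([word[1]] * (word[2] * seg_len))
        else:
            bits.extend(word[1])
    return bits[:n_bits]


def encoded_bits(words, seg_len):
    """Size of the encoding under the assumed word layout."""
    return len(words) * (seg_len + 1)


def pick_seg_len(column, candidates):
    """Choose, for one bitmap column, the candidate length that encodes smallest."""
    return min(candidates, key=lambda s: encoded_bits(encode(column, s), s))


# A column with runs that line up with 7-bit segments but not with 31-bit ones.
column = [0] * 70 + [1] * 14 + [0, 1, 0, 0, 1, 0, 1] + [0] * 21
for s in (7, 14, 28, 31):
    print(s, encoded_bits(encode(column, s), s))
print("best:", pick_seg_len(column, (7, 14, 28)))
```

On this 112-bit column the sketch needs 32 bits with 7-bit segments but 96 bits with 31-bit segments, which is the kind of gap a per-column choice is meant to exploit. The candidate lengths 7, 14, and 28 were picked so that they share the common factor 7, echoing the condition under which the abstract says queries can be processed efficiently; how the paper actually aligns words of different lengths during query processing is not reproduced here.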