Many large-scale read-only databases and data warehouses use bitmap indices to speed up data analysis. These indices are both highly compressible and able to exploit fast bit-wise operations during query processing. Numerous hybrid run-length encoding schemes have been proposed that greatly compress the index while allowing queries to run directly on the compressed form, without decompression. Typically, these schemes align their compression with the computer architecture's word size to further accelerate queries. Previously, we introduced Variable Length Compression (VLC), a more general encoding that can achieve better compression than word-aligned schemes. However, VLC's query efficiency can vary widely because compressed columns may have mismatched alignments. In this paper, we present an optimizer that recompresses the bitmap over time. Our approach lets the VLC user specify the priority of compression versus query efficiency and, based on query history, may then recompress the bitmap accordingly. In an empirical study using scientific data sets, we show that our approach achieves both better compression ratios and faster queries than WAH and PLWAH. On the largest data set, our VLC optimizer compressed up to 1.73x better than WAH and 1.46x better than PLWAH. We also show a slight improvement in query efficiency in most experiments, and a substantial speedup (11x to 16x) in special cases.
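
For readers unfamiliar with the underlying ideas, the following is a minimal sketch, not the VLC, WAH, or PLWAH encoding itself. It assumes an equality-encoded bitmap index stored in Python integers, and a toy run-length encoder whose segment length is a free parameter. All function names (`build_bitmap_index`, `rows_of`, `encode_runs`) and the sample data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_bitmap_index(column):
    """Return {value: bitmap}; bit i of a bitmap is set when row i holds that value."""
    index = defaultdict(int)
    for row, value in enumerate(column):
        index[value] |= 1 << row
    return dict(index)


def rows_of(bitmap):
    """List the row numbers whose bits are set in the bitmap."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


def encode_runs(bitmap, n_rows, seg_len):
    """Toy hybrid run-length encoding (illustrative only, not a real format).

    The bitmap is cut into seg_len-bit segments. Consecutive segments that are
    all 0s or all 1s are merged into ("fill", bit, count); any other segment
    is stored verbatim as ("literal", bits). Assumes n_rows is a multiple of
    seg_len so that no segment is padded.
    """
    full = (1 << seg_len) - 1
    words = []
    for start in range(0, n_rows, seg_len):
        seg = (bitmap >> start) & full
        if seg in (0, full):
            bit = 1 if seg == full else 0
            if words and words[-1][0] == "fill" and words[-1][1] == bit:
                # Extend the previous fill instead of emitting a new word.
                words[-1] = ("fill", bit, words[-1][2] + 1)
            else:
                words.append(("fill", bit, 1))
        else:
            words.append(("literal", seg))
    return words


# A query such as "color = red AND size = L" becomes a single bit-wise AND.
color = ["red", "blue", "red", "red", "green", "blue", "red", "red"]
size = ["S", "S", "L", "L", "S", "L", "L", "S"]
color_idx = build_bitmap_index(color)
size_idx = build_bitmap_index(size)
matching_rows = rows_of(color_idx["red"] & size_idx["L"])

# The same 16-row bitmap encoded with two different segment lengths.
sample = 0x5F00
encoded_4 = encode_runs(sample, 16, 4)
encoded_8 = encode_runs(sample, 16, 8)
```

In this toy setting, the same bitmap yields a different sequence of fill and literal words depending on the chosen segment length, which hints at why two columns compressed with different segment lengths cannot be combined word by word without extra realignment work.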