Many scientific applications generate massive volumes of data through observations or computer simulations, creating a need for indexing methods that support efficient storage and retrieval of scientific data. Unlike conventional databases, scientific data is mostly read-only and its volume can reach the order of petabytes, making a compact index structure vital. Bitmap indexing has been applied successfully to scientific databases by exploiting the fact that scientific data are enumerated or numerical. Bitmap indices can be compressed with variants of run-length encoding to obtain a compact index structure. However, even this may not be enough for the enormous volumes of data generated in some applications, such as high energy physics. In this paper, we study how to reorganize bitmap tables for improved compression rates. Our algorithms are used only as a preprocessing step, so the existing indexing techniques and query processing algorithms need no revision. We introduce the tuple reordering problem, which aims to reorganize database tuples for optimal compression rates. For this NP-complete problem, we propose a Gray code ordering algorithm that works in place and runs in time linear in the size of the database. We also discuss how the tuple reordering problem can be reduced to the traveling salesperson problem. Our experimental results on real data sets show that the compression ratio can be improved by a factor of 2 to 10.
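To illustrate the idea of Gray code ordering, the following is a minimal sketch, assuming each tuple is a row of 0/1 bits in a bitmap table. It is not the paper's in-place, linear-time algorithm: it uses Python's general comparison sort, keyed on each row's position in the binary reflected Gray code sequence. The names `gray_rank`, `gray_order`, and `count_runs` are hypothetical, chosen for this example only, and the number of runs per column is used as a rough stand-in for run-length-encoded size.

```python
from itertools import accumulate
from operator import xor


def gray_rank(row):
    """Return a sort key giving the row's position in binary reflected Gray code.

    The prefix XOR of a Gray code word is its rank written in plain binary,
    so comparing these tuples lexicographically compares the ranks.
    """
    return tuple(accumulate(row, xor))


def gray_order(rows):
    """Return the rows sorted in Gray code order (illustrative, not in place)."""
    return sorted(rows, key=gray_rank)


def count_runs(rows):
    """Count runs of identical bits, summed over all columns of the table."""
    total = 0
    for column in zip(*rows):
        # A column with k bit changes between neighbours has k + 1 runs.
        total += 1 + sum(a != b for a, b in zip(column, column[1:]))
    return total


# Toy bitmap table: every 3-bit row once, in an arbitrary order.
rows = [
    (0, 0, 0), (1, 1, 1), (0, 0, 1), (1, 1, 0),
    (0, 1, 1), (1, 0, 0), (0, 1, 0), (1, 0, 1),
]

reordered = gray_order(rows)
print("runs before:", count_runs(rows))
print("runs after: ", count_runs(reordered))
```

In this toy table, consecutive rows in the reordered output differ in exactly one bit, so each column breaks into fewer, longer runs. Real bitmap tables do not contain every possible row, so the gain depends on the data.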