BitWeaving: fast scans for main memory data processing
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
This paper focuses on running scans in a main memory data processing system at "bare metal" speed. Essentially, this means that the system must aim to process data at or near the speed of the processor (the fastest component in most system configurations). Scans are common in main memory data processing environments, and with state-of-the-art techniques it still takes many cycles per input tuple to apply simple predicates on a single column of a table. In this paper, we propose a technique called BitWeaving that exploits the parallelism available at the bit level in modern processors. BitWeaving operates on multiple bits of data in a single cycle, processing bits from different columns in each cycle. Thus, bits from a batch of tuples are processed in each cycle, allowing BitWeaving to drop the cycles per column to below one in some cases. BitWeaving comes in two flavors: BitWeaving/V, which looks like a columnar organization but at the bit level, and BitWeaving/H, which packs bits horizontally. We also develop the arithmetic framework needed to evaluate predicates using these BitWeaving organizations. Our experimental results show that both methods produce significant performance benefits over existing state-of-the-art methods, in some cases improving performance by over an order of magnitude.
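To make the idea of bit-level parallelism concrete, here is a minimal Python sketch of a generic bit-sliced "less than" scan, loosely in the spirit of the vertical (columnar-at-the-bit-level) organization described above. It is an illustration under stated assumptions, not the paper's algorithm: the function names `to_bit_planes` and `less_than_scan` are hypothetical, a Python integer stands in for a processor word, and the column is assumed to hold small unsigned integer codes of a fixed bit width.

```python
def to_bit_planes(values, width):
    """Transpose unsigned codes into `width` bit planes, most significant first.

    Bit i of every plane belongs to values[i], so one plane packs the same
    bit position of many tuples into a single word (hypothetical layout).
    """
    planes = []
    for b in range(width - 1, -1, -1):
        plane = 0
        for i, v in enumerate(values):
            plane |= ((v >> b) & 1) << i
        planes.append(plane)
    return planes


def less_than_scan(planes, constant, n):
    """Return a bitmap whose bit i is set when values[i] < constant.

    All n tuples are compared at once using only bitwise operations.
    """
    mask = (1 << n) - 1          # one bit per tuple
    width = len(planes)
    lt = 0                       # tuples already known to be smaller
    eq = mask                    # tuples still equal to the constant so far
    for j, plane in enumerate(planes):
        # Broadcast bit j of the constant to every tuple position.
        c = mask if (constant >> (width - 1 - j)) & 1 else 0
        # A tuple becomes "smaller" where it has 0 and the constant has 1.
        lt |= eq & ~plane & c
        # A tuple stays "equal" only where its bit matches the constant's bit.
        eq &= ~(plane ^ c)
        if eq == 0:              # nothing left undecided, stop early
            break
    return lt & mask


values = [3, 7, 1, 5, 0, 6]      # 3-bit codes (illustrative data)
planes = to_bit_planes(values, width=3)
bitmap = less_than_scan(planes, constant=5, n=len(values))
matches = [i for i in range(len(values)) if (bitmap >> i) & 1]
```

In this sketch each loop iteration handles one bit position for every tuple in the word, so the work per tuple shrinks as more tuples are packed into a word. A real implementation would use fixed-width machine words and process the column in word-sized segments.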