DB2 with BLU acceleration: so much more than just a column store

  • Authors:
  • Vijayshankar Raman;Gopi Attaluri;Ronald Barber;Naresh Chainani;David Kalmuk;Vincent KulandaiSamy;Jens Leenstra;Sam Lightstone;Shaorong Liu;Guy M. Lohman;Tim Malkemus;Rene Mueller;Ippokratis Pandis;Berni Schiefer;David Sharpe;Richard Sidle;Adam Storm;Liping Zhang

  • Affiliations:
  • IBM Research;IBM Software Group;IBM Research;IBM Software Group;IBM Software Group;IBM Software Group;IBM Systems & Technology Group;IBM Software Group;IBM Software Group;IBM Research;IBM Research;IBM Research;IBM Research;IBM Software Group;IBM Software Group;IBM Research;IBM Software Group;IBM Software Group

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2013

Abstract

DB2 with BLU Acceleration deeply integrates innovative new techniques for defining and processing column-organized tables that speed read-mostly Business Intelligence queries by 10 to 50 times and improve compression by 3 to 10 times, compared to traditional row-organized tables, without the complexity of defining indexes or materialized views on those tables. But DB2 BLU is much more than just a column store. Exploiting frequency-based dictionary compression and main-memory query processing technology from the Blink project at IBM Research - Almaden, DB2 BLU performs most SQL operations - predicate application (even range predicates and IN-lists), joins, and grouping - on the compressed values, which can be packed bit-aligned so densely that multiple values fit in a register and can be processed simultaneously via SIMD (single-instruction, multiple-data) instructions. Designed and built from the ground up to exploit modern multi-core processors, DB2 BLU's hardware-conscious algorithms are carefully engineered to maximize parallelism by using novel data structures that need little latching, and to minimize data-cache and instruction-cache misses. Though DB2 BLU is optimized for in-memory processing, database size is not limited by the size of main memory. Fine-grained synopses, late materialization, and a new probabilistic buffer pool protocol for scans minimize disk I/Os, while aggressive prefetching reduces I/O stalls. Full integration with DB2 ensures that DB2 with BLU Acceleration benefits from the full functionality and robust utilities of a mature product, while still enjoying order-of-magnitude performance gains from revolutionary technology without even having to change the SQL. Column-organized and row-organized tables can be mixed in the same tablespace and even within the same query.
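
To make the idea of evaluating a range predicate directly on bit-packed compressed values more concrete, here is a minimal sketch in Python. It is not DB2 BLU's implementation: it assumes a single order-preserving dictionary with fixed-width codes (the frequency-based partitioning mentioned in the abstract is left out), and it emulates a 64-bit register with a Python integer instead of using real SIMD instructions. All function names (`build_dictionary`, `field_layout`, `replicate`, `pack`, `scan_less_equal`) are hypothetical. Each code is stored with one spare delimiter bit, so a single subtraction compares every code in the word against the bound at once.

```python
from bisect import bisect_right

WORD_BITS = 64  # assumed register width for this sketch


def build_dictionary(values):
    """Order-preserving dictionary: sorted distinct values get dense codes."""
    distinct = sorted(set(values))
    return distinct, {v: code for code, v in enumerate(distinct)}


def field_layout(num_codes):
    """Return (code_bits, field_bits, fields_per_word) for a dictionary size."""
    code_bits = max(1, (num_codes - 1).bit_length())
    field_bits = code_bits + 1  # one spare delimiter bit above each code
    return code_bits, field_bits, WORD_BITS // field_bits


def replicate(value, field_bits, count):
    """Copy `value` into each of `count` adjacent fields of one word."""
    word = 0
    for i in range(count):
        word |= value << (i * field_bits)
    return word


def pack(codes, field_bits, per_word):
    """Pack codes into words, `per_word` codes each, delimiter bits left 0."""
    words = []
    for start in range(0, len(codes), per_word):
        word = 0
        for i, code in enumerate(codes[start:start + per_word]):
            word |= code << (i * field_bits)
        words.append(word)
    return words


def scan_less_equal(values, bound):
    """Return row positions where value <= bound, comparing packed codes."""
    distinct, code_of = build_dictionary(values)
    code_bits, field_bits, per_word = field_layout(len(distinct))
    words = pack([code_of[v] for v in values], field_bits, per_word)

    # Translate the predicate on values into a predicate on codes. This is
    # valid because the dictionary preserves order.
    max_code = bisect_right(distinct, bound) - 1
    if max_code < 0:
        return []

    # `high` has only the delimiter bit set in every field.
    high = replicate(1 << code_bits, field_bits, per_word)
    bound_word = replicate(max_code, field_bits, per_word) | high

    matches = []
    for w, word in enumerate(words):
        # Per field: (max_code + 2**code_bits) - code never borrows from the
        # neighbouring field, and keeps its delimiter bit iff code <= max_code.
        hits = (bound_word - word) & high
        for i in range(per_word):
            row = w * per_word + i
            # Padding fields in the last word are skipped by the row check.
            if row < len(values) and (hits >> (i * field_bits + code_bits)) & 1:
                matches.append(row)
    return matches
```

For example, `scan_less_equal([30, 10, 20, 10, 50, 40, 20], 20)` builds a five-entry dictionary, packs the seven 3-bit codes (plus delimiter bits) into one word, and returns the positions `[1, 2, 3, 6]` without decoding any value. A real engine would also avoid the per-field extraction loop, for instance by turning the hit bits into a selection bitmap.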