Meet the walkers: accelerating index traversals for in-memory databases

Authors:
Onur Kocberber;Boris Grot;Javier Picorel;Babak Falsafi;Kevin Lim;Parthasarathy Ranganathan
Affiliations:
EcoCloud, EPFL;University of Edinburgh;EcoCloud, EPFL;EcoCloud, EPFL;HP Labs;Google Inc.
Venue:
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Year:
2013

Citing 24
Cited 0

The Gamma Database Machine Project

IEEE Transactions on Knowledge and Data Engineering
Optimizing Main-Memory Join on Modern Hardware

IEEE Transactions on Knowledge and Data Engineering
SMARTS: accelerating microarchitecture simulation via rigorous statistical sampling

Proceedings of the 30th annual international symposium on Computer architecture
Accelerating database operators using a network processor

DaMoN '05 Proceedings of the 1st international workshop on Data management on new hardware
SimFlex: Statistical Sampling of Computer System Simulation

IEEE Micro
Improving hash join performance through prefetching

ACM Transactions on Database Systems (TODS)
Why you should run TPC-DS: a workload analysis

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0

Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Sort vs. Hash revisited: fast join implementation on modern multi-core CPUs

Proceedings of the VLDB Endowment
Glacier: a query-to-hardware compiler

Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Understanding sources of inefficiency in general-purpose chips

Proceedings of the 37th annual international symposium on Computer architecture
Design and evaluation of main memory hash join algorithms for multi-core CPUs

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Memory power management via dynamic voltage/frequency scaling

Proceedings of the 8th ACM international conference on Autonomic computing
Dark silicon and the end of multicore scaling

Proceedings of the 38th annual international symposium on Computer architecture
Efficient complex operators for irregular codes

HPCA '11 Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture
Dynamically Specialized Datapaths for energy efficient computing

HPCA '11 Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture
Toward Dark Silicon in Servers

IEEE Micro
Bundled execution of recurring traces for energy-efficient general purpose processing

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
QsCores: trading dark silicon for scalable energy efficiency with quasi-specific cores

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Scale-out processors

Proceedings of the 39th Annual International Symposium on Computer Architecture
Vector Extensions for Decision Support DBMS Acceleration

MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Navigating big data with high-throughput, energy-efficient data partitioning

Proceedings of the 40th Annual International Symposium on Computer Architecture
LINQits: big data on little clients

Proceedings of the 40th Annual International Symposium on Computer Architecture
Main-memory hash joins on multi-core CPUs: Tuning to the underlying hardware

ICDE '13 Proceedings of the 2013 IEEE International Conference on Data Engineering (ICDE 2013)

Quantified Score

Hi-index	0.00

Visualization

Abstract

The explosive growth in digital data and its growing role in real-time decision support motivate the design of high-performance database management systems (DBMSs). Meanwhile, slowdown in supply voltage scaling has stymied improvements in core performance and ushered an era of power-limited chips. These developments motivate the design of DBMS accelerators that (a) maximize utility by accelerating the dominant operations, and (b) provide flexibility in the choice of DBMS, data layout, and data types. We study data analytics workloads on contemporary in-memory databases and find hash index lookups to be the largest single contributor to the overall execution time. The critical path in hash index lookups consists of ALU-intensive key hashing followed by pointer chasing through a node list. Based on these observations, we introduce Widx, an on-chip accelerator for database hash index lookups, which achieves both high performance and flexibility by (1) decoupling key hashing from the list traversal, and (2) processing multiple keys in parallel on a set of programmable walker units. Widx reduces design cost and complexity through its tight integration with a conventional core, thus eliminating the need for a dedicated TLB and cache. An evaluation of Widx on a set of modern data analytics workloads (TPC-H, TPC-DS) using full-system simulation shows an average speedup of 3.1x over an aggressive OoO core on bulk hash table operations, while reducing the OoO core energy by 83%.