Hardware acceleration of database operations

Authors:
Jared Casper; Kunle Olukotun
Affiliations:
Stanford University, Stanford, CA, USA; Stanford University, Stanford, CA, USA
Venue:
Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays
Year:
2014

Abstract

As the amount of memory in database systems grows, entire database tables, or even entire databases, can fit in the system's memory, making in-memory database operations more prevalent. This shift from disk-based to in-memory database systems has contributed to a move from row-wise to columnar data storage. Furthermore, common database workloads have grown beyond online transaction processing (OLTP) to include online analytical processing (OLAP) and data mining. These workloads analyze huge datasets that are often irregular and not indexed, making traditional database operations like joins much more expensive. In this paper, we explore using dedicated hardware to accelerate in-memory database operations. We present hardware to accelerate three primitives: selection, which compacts a single column into a linear column of the selected data; merge join, which joins two sorted columns by merging them; and sorting a column. Finally, we put these primitives together to accelerate an entire join operation. We implement a prototype of this system using FPGAs and show substantial improvements in both absolute throughput and utilization of memory bandwidth. Using the prototype as a guide, we explore how the hardware resources required by our design change with the desired throughput.
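
For readers who want a concrete picture of the operations the abstract names, the following is a minimal software sketch of selection (compaction), merge join, and a sort-merge join built from them. It is only a reference model of what the primitives compute, not the paper's hardware design. The function names, the boolean-mask interface, and the choice to return row-index pairs are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def select_compact(column: Sequence[int], mask: Sequence[bool]) -> List[int]:
    """Selection: keep the values whose mask entry is True, packed densely.

    The mask interface is an assumption; the abstract only says that a
    column is compacted into a linear column of selected data.
    """
    return [value for value, keep in zip(column, mask) if keep]


def merge_join(left: Sequence[int], right: Sequence[int]) -> List[Tuple[int, int]]:
    """Equi-join two SORTED key columns by merging.

    Returns (left_index, right_index) pairs for every pair of equal keys.
    """
    pairs: List[Tuple[int, int]] = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1
        elif left[i] > right[j]:
            j += 1
        else:
            key = left[i]
            # Find the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end] == key:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end] == key:
                j_end += 1
            # Emit the cross product of the two runs.
            for a in range(i, i_end):
                for b in range(j, j_end):
                    pairs.append((a, b))
            i, j = i_end, j_end
    return pairs


def sort_merge_join(left: Sequence[int], right: Sequence[int]) -> List[Tuple[int, int]]:
    """Full join on unsorted key columns: sort both, then merge join.

    Returns pairs of ORIGINAL row indices (left_row, right_row).
    """
    # Sort row ids by key so matches can be mapped back to original rows.
    left_rows = sorted(range(len(left)), key=lambda r: left[r])
    right_rows = sorted(range(len(right)), key=lambda r: right[r])
    matched = merge_join([left[r] for r in left_rows],
                         [right[r] for r in right_rows])
    return [(left_rows[a], right_rows[b]) for a, b in matched]
```

In this sketch, a call such as `sort_merge_join(orders_key, customers_key)` (both column names hypothetical) yields the row-index pairs that a columnar engine could then use to gather the remaining columns of each table.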