Ameliorating memory contention of OLAP operators on GPU processors

Authors:
Evangelia A. Sitaridi;Kenneth A. Ross
Affiliations:
Columbia University;Columbia University
Venue:
DaMoN '12 Proceedings of the Eighth International Workshop on Data Management on New Hardware
Year:
2012

Abstract

Implementations of database operators on GPU processors have shown dramatic performance improvements compared to multicore-CPU implementations. GPU threads can cooperate using shared memory, which is organized in interleaved banks and is fast only when threads read and modify addresses belonging to distinct memory banks. Therefore, data processing operators implemented on a GPU, in addition to contention caused by popular values, have to deal with a new performance-limiting factor: thread serialization when accessing values belonging to the same bank. Here, we define the problem of bank and value conflict optimization for data processing operators using the CUDA platform. To analyze the impact of these two factors on operator performance, we use two database operations: foreign-key join and grouped aggregation. We suggest and evaluate techniques for optimizing the data arrangement offline by creating clones of values to reduce overall memory contention. Results indicate that columns used for writes, such as grouping columns, need to be optimized to fully exploit the maximum bandwidth of shared memory.
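
The two kinds of contention described in the abstract can be illustrated with a small host-side model. The sketch below is not the authors' method. It assumes 32 shared-memory banks interleaved at word granularity (an assumption not stated in the abstract) and uses a deliberately simplified cost model in which updates to the same word, or to distinct words in the same bank, are serialized. The names `conflict_degrees` and `spread_over_clones` are hypothetical.

```python
from collections import Counter

NUM_BANKS = 32  # assumption: 32 banks, consecutive words map to consecutive banks


def conflict_degrees(word_indices):
    """Return (value_conflict, bank_conflict) for one group of threads.

    word_indices[t] is the shared-memory word that thread t updates.
    value_conflict: most threads updating one and the same word.
    bank_conflict: most distinct words that fall into one and the same bank.
    """
    threads_per_word = Counter(word_indices)
    words_per_bank = Counter(w % NUM_BANKS for w in threads_per_word)
    value_conflict = max(threads_per_word.values(), default=0)
    bank_conflict = max(words_per_bank.values(), default=0)
    return value_conflict, bank_conflict


def spread_over_clones(groups, clone_words):
    """Map each thread's target group to one of that group's clones.

    groups[t] is the group thread t updates; clone_words[g] lists the word
    indices holding the clones of group g. Thread t picks clone t mod k.
    """
    return [clone_words[g][t % len(clone_words[g])]
            for t, g in enumerate(groups)]


if __name__ == "__main__":
    # Unit stride: every thread hits its own word in its own bank.
    print(conflict_degrees(range(32)))                    # (1, 1)
    # Stride of 32 words: 32 distinct words, all in bank 0.
    print(conflict_degrees([t * 32 for t in range(32)]))  # (1, 32)
    # One popular group: all 32 threads update the same word.
    hot = [0] * 32
    print(conflict_degrees(hot))                          # (32, 1)
    # Four clones of the popular group, placed in four different banks.
    cloned = spread_over_clones(hot, {0: [0, 1, 2, 3]})
    print(conflict_degrees(cloned))                       # (8, 1)
```

In this model, cloning a popular value cuts the value conflict from 32 to 8, provided the clones are placed in distinct banks so that they do not introduce new bank conflicts. The partial results held in the clones would still have to be combined afterwards, which the sketch does not show.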