Implementations of database operators on GPUs have shown dramatic performance improvements over multicore-CPU implementations. GPU threads can cooperate through shared memory, which is organized in interleaved banks and is fast only when threads read and modify addresses that belong to distinct memory banks. Data processing operators implemented on a GPU therefore have to deal with a new performance-limiting factor in addition to the contention caused by popular values: thread serialization when accessing values that belong to the same bank. Here, we define the problem of bank and value conflict optimization for data processing operators on the CUDA platform. To analyze the impact of these two factors on operator performance, we use two database operations: foreign-key join and grouped aggregation. We propose and evaluate techniques that optimize the data arrangement offline by creating clones of values to reduce overall memory contention. Results indicate that columns used for writes, such as grouping columns, need to be optimized to fully exploit the maximum bandwidth of shared memory.
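The two contention sources named above can be illustrated with a small host-side model. This is a minimal sketch rather than the paper's method. It assumes 32 banks of 4-byte words, so that word index modulo 32 gives the bank. It also assumes that reads of the same word are broadcast while atomic updates to the same word serialize. The function names and the clone layout in `clone_slot` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

NUM_BANKS = 32  # assumption: 32 interleaved banks, one 4-byte word per bank slot


def bank_of(word_index, num_banks=NUM_BANKS):
    """Bank that holds a given shared-memory word under interleaved mapping."""
    return word_index % num_banks


def read_serialization(word_indices, num_banks=NUM_BANKS):
    """Modelled number of passes for one warp-wide read.

    Distinct words in the same bank serialize (bank conflict);
    threads reading the same word are assumed to be served by a broadcast.
    """
    words_per_bank = defaultdict(set)
    for w in word_indices:
        words_per_bank[bank_of(w, num_banks)].add(w)
    return max((len(ws) for ws in words_per_bank.values()), default=0)


def atomic_write_serialization(word_indices, num_banks=NUM_BANKS):
    """Modelled number of passes for one warp-wide atomic update.

    Every access to a bank serializes, whether it targets a different word
    (bank conflict) or the same word (value conflict).
    """
    accesses_per_bank = defaultdict(int)
    for w in word_indices:
        accesses_per_bank[bank_of(w, num_banks)] += 1
    return max(accesses_per_bank.values(), default=0)


def clone_slot(group, thread_id, num_clones):
    """Hypothetical layout: the clones of a group are stored contiguously,
    and each thread picks one clone by its thread id."""
    return group * num_clones + (thread_id % num_clones)


warp = range(32)

# Unit-stride reads touch 32 distinct banks: one pass.
stride_1 = read_serialization([t for t in warp])
# Stride-32 reads all land in bank 0: 32 passes.
stride_32 = read_serialization([t * 32 for t in warp])

# Grouped aggregation where every thread updates the same popular group.
no_clones = atomic_write_serialization([clone_slot(0, t, 1) for t in warp])
eight_clones = atomic_write_serialization([clone_slot(0, t, 8) for t in warp])

print(stride_1, stride_32, no_clones, eight_clones)
```

In this model, spreading a popular group over eight clones cuts the serialization of the update from 32 passes to 4, at the cost of a final step that sums the clones. Real hardware behavior depends on the GPU generation, so the numbers are indicative only.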