Parallel database systems: the future of high performance database systems
Communications of the ACM
Quickly generating billion-record synthetic databases
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Adaptive parallel aggregation algorithms
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Efficient mid-query re-optimization of sub-optimal query execution plans
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Eddies: continuously adaptive query processing
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
The Art of Computer Programming Volumes 1-3 Boxed Set
The Art of Computer Programming Volumes 1-3 Boxed Set
Optimizing Main-Memory Join on Modern Hardware
IEEE Transactions on Knowledge and Data Engineering
Fast computation of database operations using graphics processors
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Robust query processing through progressive optimization
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Accelerating database operators using a network processor
DaMoN '05 Proceedings of the 1st international workshop on Data management on new hardware
Realizing parallelism in database operations: insights from a massively multithreaded architecture
DaMoN '06 Proceedings of the 2nd international workshop on Data management on new hardware
Architecture-conscious hashing
DaMoN '06 Proceedings of the 2nd international workshop on Data management on new hardware
Adaptive query processing: why, how, when, what next
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Query co-processing on commodity processors
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Parallel buffers for chip multiprocessors
DaMoN '07 Proceedings of the 3rd international workshop on Data management on new hardware
Main-memory scan sharing for multi-core CPUs
Proceedings of the VLDB Endowment
CAM conscious integrated answering of frequent elements and top-k queries over data streams
Proceedings of the 4th international workshop on Data management on new hardware
Data partitioning on chip multiprocessors
Proceedings of the 4th international workshop on Data management on new hardware
Hash Join Optimization Based on Shared Cache Chip Multi-processor
DASFAA '09 Proceedings of the 14th International Conference on Database Systems for Advanced Applications
Cache-conscious buffering for database operators with state
Proceedings of the Fifth International Workshop on Data Management on New Hardware
Relational query coprocessing on graphics processors
ACM Transactions on Database Systems (TODS)
Sort vs. Hash revisited: fast join implementation on modern multi-core CPUs
Proceedings of the VLDB Endowment
A scalable, predictable join operator for highly concurrent data warehouses
Proceedings of the VLDB Endowment
MCC-DB: minimizing cache conflicts in multi-core processors for databases
Proceedings of the VLDB Endowment
Improving the performance of list intersection
Proceedings of the VLDB Endowment
Suffix tree construction algorithms on modern hardware
Proceedings of the 13th International Conference on Extending Database Technology
FAST: fast architecture sensitive tree search on modern CPUs and GPUs
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Automatic contention detection and amelioration for data-intensive operations
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Fast and compact hash tables for integer keys
ACSC '09 Proceedings of the Thirty-Second Australasian Conference on Computer Science - Volume 91
Comprehensive on-chip traffic generator model for SoC design and synthesis
SpringSim '10 Proceedings of the 2010 Spring Simulation Multiconference
Predictable performance and high query concurrency for data analytics
The VLDB Journal — The International Journal on Very Large Data Bases
Scalable aggregation on multicore processors
Proceedings of the Seventh International Workshop on Data Management on New Hardware
Efficiently compiling efficient query plans for modern hardware
Proceedings of the VLDB Endowment
Designing fast architecture-sensitive tree search on modern multicore/many-core processors
ACM Transactions on Database Systems (TODS)
Fast updates on read-optimized databases using multi-core CPUs
Proceedings of the VLDB Endowment
Aggregation strategies for columnar in-memory databases in a mixed workload
Proceedings of the 4th workshop on Workshop for Ph.D. students in information & knowledge management
Clydesdale: structured data processing on hadoop
SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
CloudRAMSort: fast and efficient large-scale distributed RAM sort on shared-nothing cluster
SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
DaMoN '12 Proceedings of the Eighth International Workshop on Data Management on New Hardware
Adaptive MapReduce using situation-aware mappers
Proceedings of the 15th International Conference on Extending Database Technology
An in-depth analysis of data aggregation cost factors in a columnar in-memory database
Proceedings of the fifteenth international workshop on Data warehousing and OLAP
High throughput heavy hitter aggregation for modern SIMD processors
Proceedings of the Ninth International Workshop on Data Management on New Hardware
Sharing data and work across concurrent analytical queries
Proceedings of the VLDB Endowment
Eliminating unscalable communication in transaction processing
The VLDB Journal — The International Journal on Very Large Data Bases
Hi-index | 0.00 |
The recent introduction of commodity chip multiprocessors requires that the design of core database operations be carefully examined to take full advantage of on-chip parallelism. In this paper we examine aggregation in a multi-core environment, the Sun UltraSPARC T1, a chip multiprocessor with eight cores and a shared L2 cache. Aggregation is an important aspect of query processing that is seemingly easy to understand and implement. Our research, however, demonstrates that a chip multiprocessor adds new dimensions to understanding hash-based aggregation performance---concurrent sharing of aggregation data structures and contentious accesses to frequently used values. We also identify a trade off between private data structures assigned to each thread versus shared data structures for aggregation. Depending on input characteristics, different aggregation strategies are optimal and choosing the wrong strategy can result in a performance penalty of over an order of magnitude. We provide a thorough explanation of the factors affecting aggregation performance on chip multiprocessors and identify three key input characteristics that dictate performance: (1) average run length of identical group-by values, (2) locality of references to the aggregation hash table, and (3) frequency of repeated accesses to the same hash table location. We then introduce an adaptive aggregation operator that performs lightweight sampling of the input to choose the correct aggregation strategy with high accuracy. Our experiments verify that our adaptive algorithm chooses the highest performing aggregation strategy on a number of common input distributions.