Systolic (VLSI) arrays for relational database operations
SIGMOD '80 Proceedings of the 1980 ACM SIGMOD international conference on Management of data
Optimizing Main-Memory Join on Modern Hardware
IEEE Transactions on Knowledge and Data Engineering
Control Versus Data Flow in Parallel Database Machines
IEEE Transactions on Parallel and Distributed Systems
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Cache Conscious Algorithms for Relational Query Processing
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Super-Scalar RAM-CPU Cache Compression
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
An integrated efficient solution for computing frequent and top-k elements in data streams
ACM Transactions on Database Systems (TODS)
FastForward for efficient pipeline parallelism: a cache-optimized concurrent lock-free queue
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Finding frequent items in data streams
Proceedings of the VLDB Endowment
Parallel Data Mining on Multicore Clusters
GCC '08 Proceedings of the 2008 Seventh International Conference on Grid and Cooperative Computing
Frequent items in streaming data: An experimental evaluation of the state-of-the-art
Data & Knowledge Engineering
The multikernel: a new OS architecture for scalable multicore systems
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Memory Performance and Cache Coherency Effects on an Intel Nehalem Multiprocessor System
PACT '09 Proceedings of the 2009 18th International Conference on Parallel Architectures and Compilation Techniques
Sort vs. Hash revisited: fast join implementation on modern multi-core CPUs
Proceedings of the VLDB Endowment
Thread cooperation in multicore architectures for frequency counting over multiple data streams
Proceedings of the VLDB Endowment
SIMD-scan: ultra fast in-memory table scan using on-chip vector processing units
Proceedings of the VLDB Endowment
Design and evaluation of main memory hash join algorithms for multi-core CPUs
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
How soccer players would do stream joins
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Scalable aggregation on multicore processors
Proceedings of the Seventh International Workshop on Data Management on New Hardware
Finding frequent items in parallel
Concurrency and Computation: Practice & Experience
High throughput heavy hitter aggregation for modern SIMD processors
Proceedings of the Ninth International Workshop on Data Management on New Hardware
Accelerating frequent item counting with FPGA
Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays
Hi-index | 0.00 |
The increasing number of cores and the rich instruction sets of modern hardware are opening up new opportunities for optimizing many traditional data mining tasks. In this paper we demonstrate how to speed up the performance of the computation of frequent items by almost one order of magnitude over the best published results by matching the algorithm to the underlying hardware architecture. We start with the observation that frequent item counting, like other data mining tasks, assumes certain amount of skew in the data. We exploit this skew to design a new algorithm that uses a pre-filtering stage that can be implemented in a highly efficient manner through SIMD instructions. Using pipelining, we then combine this pre-filtering stage with a conventional frequent item algorithm (Space-Saving) that will process the remainder of the data. The resulting operator can be parallelized with a small number of cores, leading to a parallel implementation that does not suffer any of the overheads of existing parallel solutions when querying the results and offers significantly higher throughput.