Scalable frequent itemset mining on many-core processors

Authors:
Benjamin Schlegel;Tomas Karnagel;Tim Kiefer;Wolfgang Lehner
Affiliations:
Technische Universität Dresden, Dresden, Germany;Technische Universität Dresden, Dresden, Germany;Technische Universität Dresden, Dresden, Germany;Technische Universität Dresden, Dresden, Germany
Venue:
Proceedings of the Ninth International Workshop on Data Management on New Hardware
Year:
2013

Abstract

Frequent-itemset mining is an essential part of the association rule mining process, which has many application areas. It is a computation- and memory-intensive task with many opportunities for optimization. Many efficient sequential and parallel algorithms have been proposed in recent years. Most of the parallel algorithms, however, cannot cope with the huge number of threads provided by large multiprocessor or many-core systems. In this paper, we present mcEclat, a highly parallel version of the well-known Eclat algorithm. It runs on both multiprocessor systems and many-core coprocessors, and scales well up to a very large number of threads (244 in our experiments). To evaluate mcEclat's performance, we conducted many experiments on realistic datasets. mcEclat achieves speedups of up to 11.5x on a 12-core multiprocessor system and up to 100x on a 61-core Xeon Phi many-core coprocessor. Furthermore, mcEclat is competitive with highly optimized existing frequent-itemset mining implementations taken from the FIMI repository.