A highly parallel algorithm for frequent itemset mining

Authors:
Alejandro Mesa; Claudia Feregrino-Uribe; René Cumplido; José Hernández-Palancar
Affiliations:
Advanced Technologies Application Center, La Habana, Cuba and National Institute for Astrophysics, Optics and Electronics, Puebla, México; National Institute for Astrophysics, Optics and Electronics, Puebla, México; National Institute for Astrophysics, Optics and Electronics, Puebla, México; Advanced Technologies Application Center, La Habana, Cuba
Venue:
MCPR'10 Proceedings of the 2nd Mexican conference on Pattern recognition: Advances in pattern recognition
Year:
2010

Abstract

Mining frequent itemsets in large databases is a widely used technique in data mining. Several sequential and parallel algorithms have been developed, but on high data volumes they take more time and resources than expected. Finding ways to speed up their execution is therefore an active research topic. Previous attempts at acceleration using custom architectures have been limited because the algorithms were conceived sequentially and do not exploit the intrinsic parallelism that hardware provides. This paper presents a highly parallel algorithm that uses a vertical bit vector (VBV) data layout, which is well suited to support counting. Our results show that, for dense databases, a custom architecture for this algorithm can be one order of magnitude faster than the fastest architecture reported in previous work.
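
To make the idea of a vertical bit vector layout concrete, the following Python sketch shows how support counting can reduce to bitwise AND plus a population count. It is only an illustration of the general VBV technique, not the algorithm or architecture proposed in the paper; the function names `build_vbv` and `support` and the toy transactions are assumptions introduced here.

```python
from typing import Dict, Hashable, Iterable, List, Set


def build_vbv(transactions: List[Set[Hashable]]) -> Dict[Hashable, int]:
    """Build one bit vector per item (hypothetical helper).

    Bit t of an item's vector is 1 when transaction t contains the item.
    Python integers serve as arbitrary-length bit vectors.
    """
    vectors: Dict[Hashable, int] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(itemset: Iterable[Hashable],
            vectors: Dict[Hashable, int],
            n_transactions: int) -> int:
    """Count the transactions that contain every item of the itemset.

    The item vectors are ANDed together and the surviving 1 bits are counted.
    """
    acc = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        acc &= vectors.get(item, 0)  # an unknown item matches no transaction
    return bin(acc).count("1")


# Toy database, assumed purely for illustration.
transactions = [
    {"a", "b", "c"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
vbv = build_vbv(transactions)
print(support({"a", "c"}, vbv, len(transactions)))
```

Because each bit position is processed independently of the others, the AND and the counting step are natural candidates for parallel evaluation, which is the kind of property a hardware implementation can exploit.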