A highly parallel algorithm for frequent itemset mining

  • Authors:
  • Alejandro Mesa;Claudia Feregrino-Uribe;René Cumplido;José Hernández-Palancar

  • Affiliations:
  • Advanced Technologies Application Center, La Habana, Cuba and National Institute for Astrophysics, Optics and Electronics, Puebla, México;National Institute for Astrophysics, Optics and Electronics, Puebla, México;National Institute for Astrophysics, Optics and Electronics, Puebla, México;Advanced Technologies Application Center, La Habana, Cuba

  • Venue:
  • MCPR'10 Proceedings of the 2nd Mexican conference on Pattern recognition: Advances in pattern recognition
  • Year:
  • 2010

Abstract

Mining frequent itemsets in large databases is a widely used technique in data mining. Several sequential and parallel algorithms have been developed, but on high data volumes their execution takes more time and resources than expected. For this reason, finding ways to reduce the execution time of those algorithms is an active research topic. Previous attempts at acceleration using custom architectures have been limited because the algorithms were conceived sequentially and do not exploit the intrinsic parallelism that the hardware provides. This paper contributes a highly parallel algorithm that uses a vertical bit vector (VBV) data layout and examines its feasibility for support counting. Our results show that, for dense databases, a custom architecture for this algorithm can run one order of magnitude faster than the fastest architecture reported in previous work.
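
The abstract does not describe the algorithm itself, so the following is only a minimal software sketch of the general idea behind a vertical bit vector layout, not the authors' hardware design. It assumes each item is stored as one bit vector with bit t set when transaction t contains the item, so the support of an itemset is the number of set bits in the bitwise AND of its items' vectors. The function names `build_vbv` and `support` and the sample transactions are hypothetical.

```python
from typing import Dict, Iterable, List


def build_vbv(transactions: List[Iterable[str]]) -> Dict[str, int]:
    """Map each item to an integer whose bit t is set when transaction t contains it."""
    vectors: Dict[str, int] = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(itemset: Iterable[str], vectors: Dict[str, int], n_transactions: int) -> int:
    """Count transactions containing every item: AND the bit vectors, then count set bits."""
    acc = (1 << n_transactions) - 1  # all-ones mask: every transaction starts as a candidate
    for item in itemset:
        acc &= vectors.get(item, 0)  # an unknown item has an all-zero vector
    return bin(acc).count("1")


# Hypothetical toy database of four transactions.
transactions = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c"],
]
vbv = build_vbv(transactions)
n = len(transactions)

support_ac = support(["a", "c"], vbv, n)  # transactions 0, 1 and 3 contain both items
support_ab = support(["a", "b"], vbv, n)  # transactions 0 and 3 contain both items
print(support_ac, support_ab)  # prints "3 2"
```

Because each bit position is processed independently of the others, the AND and the bit count are natural candidates for wide parallel hardware, which is presumably why such a layout suits a custom architecture.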