An FPGA-Based Accelerator for Frequent Itemset Mining

Authors:
Yan Zhang; Fan Zhang; Zheming Jin; Jason D. Bakos
Affiliations:
University of South Carolina; University of South Carolina; University of South Carolina; University of South Carolina
Venue:
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Year:
2013


Abstract

In this article we describe a Field Programmable Gate Array (FPGA)-based coprocessor architecture for Frequent Itemset Mining (FIM). FIM is a common data mining task that finds frequently occurring subsets among a database of sets. It is a nonnumerical, data-intensive computation used in machine learning and computational biology. FIM is particularly expensive, in both execution time and memory, when performed on large and/or sparse databases or when applied with a low appearance frequency threshold. For this reason, developing increasingly efficient FIM algorithms and mapping them to parallel architectures is an active field. Previous attempts to accelerate FIM using FPGAs have relied on performance-limiting strategies such as iterative database loading and runtime logic unit reconfiguration. In this article, we present a novel architecture that implements Eclat, a well-known FIM algorithm. Unlike previous efforts, our technique does not impose limits on the maximum set size as a function of available FPGA logic resources, and our design scales well to multiple FPGAs. In addition to the hardware design, we present a corresponding compression scheme for intermediate results stored in on-chip memory. On a four-FPGA board, experimental results show up to 68x speedup compared to a highly optimized software implementation.
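
For readers unfamiliar with Eclat, the following is a minimal software sketch of the textbook algorithm, not the hardware architecture or compression scheme described in the article. It assumes the usual vertical layout, in which each item maps to the set of transaction ids that contain it, and counts the support of a larger itemset by intersecting those sets. The function name `eclat`, the helper `extend`, and the `min_support` parameter are illustrative choices, not names taken from the article.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def eclat(transactions: Iterable[Iterable[str]],
          min_support: int) -> Dict[FrozenSet[str], int]:
    """Return each itemset found in at least min_support transactions.

    Textbook-style sketch (assumption: plain Python sets stand in for
    whatever tid-list representation a real implementation would use).
    """
    # Vertical layout: item -> ids of the transactions that contain it.
    tidsets: Dict[str, Set[int]] = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            tidsets.setdefault(item, set()).add(tid)

    frequent: Dict[FrozenSet[str], int] = {}

    def extend(prefix: FrozenSet[str],
               candidates: List[Tuple[str, Set[int]]]) -> None:
        # Every candidate here is already known to be frequent
        # when joined with the current prefix.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Intersect with the later candidates to grow the itemset by one.
            narrower = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    narrower.append((other, shared))
            if narrower:
                extend(itemset, narrower)

    # Start the depth-first search from the frequent single items.
    start = sorted(
        ((item, tids) for item, tids in tidsets.items()
         if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return frequent
```

Because support counting reduces to set intersection, the database is read once to build the vertical layout and the search then works only on intermediate transaction-id sets. Those intermediate sets are the kind of data the abstract refers to as intermediate results, although the sketch makes no attempt to compress them.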