Interactive presentation: An FPGA implementation of decision tree classification

Authors:
Ramanathan Narayanan;Daniel Honbo;Gokhan Memik;Alok Choudhary;Joseph Zambreno
Affiliations:
Northwestern University, Evanston, IL;Northwestern University, Evanston, IL;Northwestern University, Evanston, IL;Northwestern University, Evanston, IL;Iowa State University, Ames, IA
Venue:
Proceedings of the conference on Design, automation and test in Europe
Year:
2007

Citing 6
Cited 1

Fast discovery of association rules

Advances in knowledge discovery and data mining
Data mining: concepts and techniques

Data mining: concepts and techniques
SPRINT: A Scalable Parallel Classifier for Data Mining

VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
ScalParC: A New Scalable and Efficient Parallel Classification Algorithm for Mining Large Datasets

IPPS '98 Proceedings of the 12th. International Parallel Processing Symposium on International Parallel Processing Symposium
Efficient Hardware Data Mining with the Apriori Algorithm on FPGAs

FCCM '05 Proceedings of the 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
An Architecture for Efficient Hardware Data Mining using Reconfigurable Computing Systems

FCCM '06 Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines

HC-CART: A parallel system implementation of data mining classification and regression tree (CART) algorithm on a multi-FPGA system

ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers

Quantified Score

Hi-index	0.00

Visualization

Abstract

Data mining techniques are a rapidly emerging class of applications that have widespread use in several fields. One important problem in data mining is Classification, which is the task of assigning objects to one of several predefined categories. Among the several solutions developed, Decision Tree Classification (DTC) is a popular method that yields high accuracy while handling large datasets. However, DTC is a computationally intensive algorithm, and as data sizes increase, its running time can stretch to several hours. In this paper, we propose a hardware implementation of Decision Tree Classification. We identify the compute-intensive kernel (Gini Score computation) in the algorithm, and develop a highly efficient architecture, which is further optimized by reordering the computations and by using a bitmapped data structure. Our implementation on a Xilinx Virtex-II Pro FPGA platform (with 16 Gini units) provides up to 5.58x performance improvement over an equivalent software implementation.