Data mining is a new field of computer science with a wide range of applications. Its goal is to extract knowledge from massive datasets in a human-understandable structure, such as a decision tree. In this article we present an innovative, high-performance, system-level architecture for the Classification And Regression Tree (CART) algorithm, one of the most important and widely used algorithms in data mining. The proposed architecture exploits parallelism at the decision-variable level and was fully implemented and evaluated on a modern high-performance reconfigurable platform, the Convey HC-1 server, which features four FPGAs and a multicore processor. Our FPGA-based implementation was integrated with the widely used “rpart” software library of the R project to provide the first fully functional reconfigurable system that can handle large real-world databases. The proposed system, named HC-CART, achieves a speedup of up to two orders of magnitude over well-known single-threaded data mining software platforms such as WEKA and R. It also outperforms, by an order of magnitude, similar hardware systems that implement parts of the complete application. Finally, we show that HC-CART offers a higher speedup than some other proposed parallel software implementations of decision tree construction algorithms.