We present implementations of two data-mining algorithms on a single CELL processor and on a low-cost CBEA (CELL Broadband Engine Architecture) cluster built from multiple PlayStation 3 consoles. Typical batch-processing environments are often unsuitable for interactive data-mining processes, which require repeated adjustments to parameters, pre-processing steps, and data, while contemporary desktops do not offer sufficient resources for the large datasets available today. Our implementations of the k-Nearest Neighbour and Decision Tree algorithms scale linearly with the number of samples in the training data and with the number of processors, and demonstrate runtimes of under a minute for up to 500 000 samples.
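To illustrate why k-Nearest Neighbour classification divides naturally across processors, the following is a minimal sketch in plain Python. It is not the CELL implementation described above. The function names (`local_top_k`, `knn_classify`), the `n_workers` parameter, and the use of squared Euclidean distance are assumptions made for illustration. Each worker scans only its own share of the training samples and returns its k best candidates, and a cheap merge step picks the global k nearest, so the per-worker cost grows linearly with the chunk size.

```python
import heapq
from collections import Counter


def local_top_k(chunk, query, k):
    """Return the k nearest (squared_distance, label) pairs within one chunk."""
    scored = (
        (sum((a - b) ** 2 for a, b in zip(features, query)), label)
        for features, label in chunk
    )
    return heapq.nsmallest(k, scored, key=lambda pair: pair[0])


def knn_classify(train, query, k, n_workers):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_tuple, label) pairs. The chunks are
    processed sequentially here; on real hardware each chunk would be
    handed to a separate processor (hypothetical mapping, for illustration).
    """
    # Strided split: every worker receives roughly len(train) / n_workers samples.
    chunks = [train[i::n_workers] for i in range(n_workers)]
    partials = [local_top_k(chunk, query, k) for chunk in chunks]
    # Merge step: the global k nearest must be among the local candidates.
    merged = heapq.nsmallest(
        k, (pair for part in partials for pair in part), key=lambda pair: pair[0]
    )
    votes = Counter(label for _, label in merged)
    return votes.most_common(1)[0][0]


train = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.0), "a"),
    ((0.0, 0.2), "a"),
    ((1.0, 1.0), "b"),
    ((0.9, 1.0), "b"),
]
label = knn_classify(train, (0.95, 0.95), k=3, n_workers=2)
```

The merge step handles at most `k * n_workers` candidates, so it stays small compared with the scan over the training data, which is the part that benefits from adding processors.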