Interactive data mining on a CBEA cluster

  • Authors:
  • Sabine McConnell;David Patton;Richard Hurley;Wilfred Blight;Graeme Young

  • Affiliations:
  • Department of Computing and Information Systems, Trent University, Peterborough, ON, Canada;Department of Physics and Astronomy, Trent University, Peterborough, ON, Canada;Department of Computing and Information Systems, Trent University, Peterborough, ON, Canada;Department of Computing and Information Systems, Trent University, Peterborough, ON, Canada;Department of Computing and Information Systems, Trent University, Peterborough, ON, Canada

  • Venue:
  • HPCS'09 Proceedings of the 23rd international conference on High Performance Computing Systems and Applications
  • Year:
  • 2009

Abstract

We present implementations of two data-mining algorithms on a single CELL processor and on a low-cost CBEA (CELL Broadband Engine Architecture) cluster built from multiple PlayStation3 consoles. Typical batch-processing environments are often unsuitable for interactive data mining, which requires repeated adjustments to parameters, pre-processing steps, and data, while contemporary desktops do not offer sufficient resources for the large datasets available today. Our implementations of the k-Nearest Neighbour and decision tree algorithms scale linearly with the number of samples in the training data and with the number of processors, and demonstrate runtimes of under a minute for up to 500 000 samples.
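
The abstract does not say how the training data is divided among processors. As a rough illustration only, one common way to parallelise a k-Nearest Neighbour query is to split the training samples into partitions, find the k nearest candidates within each partition independently, and then merge those candidates. The sketch below assumes that scheme; the function names `local_top_k` and `partitioned_knn`, the strided split, and the toy data are all hypothetical and are not taken from the paper. It runs the partitions sequentially, where a cluster would assign each one to a separate processor.

```python
import heapq
from collections import Counter


def local_top_k(chunk, query, k):
    """Return the k nearest (squared distance, label) pairs in one partition."""
    scored = (
        (sum((a - b) ** 2 for a, b in zip(point, query)), label)
        for point, label in chunk
    )
    return heapq.nsmallest(k, scored)


def partitioned_knn(samples, query, k, n_workers):
    """Classify `query` by merging per-partition candidates (illustrative only)."""
    # Assumed strided split: each partition holds about len(samples) / n_workers points.
    chunks = [samples[i::n_workers] for i in range(n_workers)]
    # Each local_top_k call is independent, so it could run on its own processor.
    candidates = [c for chunk in chunks for c in local_top_k(chunk, query, k)]
    # The global k nearest are always among the per-partition k nearest.
    nearest = heapq.nsmallest(k, candidates)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data, invented for this example.
samples = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((5.0, 5.0), "b"), ((5.1, 4.9), "b"), ((4.8, 5.2), "b"),
]
print(partitioned_knn(samples, (0.3, 0.3), k=3, n_workers=2))
```

Under this assumed scheme, the distance computations per partition shrink in proportion to the number of workers, which is consistent with the kind of linear scaling the abstract reports, though the paper's actual design may differ.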