G-means improved for Cell BE environment

Authors:
Aislan G. Foina; Rosa M. Badia; Javier Ramirez-Fernandez
Affiliations:
Universidade de São Paulo, São Paulo, Brazil and Barcelona Supercomputing Center and Artificial Intelligence Research Institute, Spanish National Research Council, Barcelona, Spain; Barcelona Supercomputing Center and Artificial Intelligence Research Institute, Spanish National Research Council, Barcelona, Spain; Universidade de São Paulo, São Paulo, Brazil
Venue:
Facing the multicore-challenge
Year:
2010

Abstract

This paper describes the performance gain obtained by adapting the G-means algorithm to a Cell BE environment using the CellSs framework. G-means is a clustering algorithm based on k-means that finds the number of Gaussian distributions, and their centers, in a multi-dimensional dataset. It is typically used in data mining applications, and its execution can be divided into six steps. The paper analyzes each step to identify which of them could be improved. In the implementation, the algorithm was modified to use the SIMD instructions specific to the Cell processor and to introduce parallelism, with the CellSs framework handling the SPU tasks. The hardware used was an IBM BladeCenter QS22 containing two PowerXCell processors. The results show that the improved algorithm runs 60% faster than the unmodified code.
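
The abstract does not list the six steps, so the following is only a minimal sketch of the core idea behind G-means as it is generally described: split a cluster in two with k-means, project its points onto the line joining the two candidate centers, and keep the split only if the projection fails an Anderson-Darling normality test. It is plain, sequential Python and does not represent the authors' SIMD or CellSs code. The function names, the farthest-point initialization, the minimum cluster size, and the default threshold (1.8692, a commonly quoted critical value for a significance level of 0.0001) are assumptions made for illustration.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def anderson_darling(values):
    """Corrected Anderson-Darling statistic (mean and variance estimated)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if std == 0.0:
        return 0.0  # degenerate sample: no evidence for a split
    z = sorted((v - mean) / std for v in values)
    eps = 1e-12  # keeps log() away from zero
    cdf = [min(max(normal_cdf(x), eps), 1.0 - eps) for x in z]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    return a2 * (1.0 + 4.0 / n - 25.0 / (n * n))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def two_means(points, iterations=20):
    """Plain 2-means with a deterministic farthest-point start (assumption)."""
    center = centroid(points)
    c1 = max(points, key=lambda p: sq_dist(p, center))
    c2 = max(points, key=lambda p: sq_dist(p, c1))
    for _ in range(iterations):
        left = [p for p in points if sq_dist(p, c1) <= sq_dist(p, c2)]
        right = [p for p in points if sq_dist(p, c1) > sq_dist(p, c2)]
        if not left or not right:
            break
        c1, c2 = centroid(left), centroid(right)
    return c1, c2


def should_split(points, critical=1.8692):
    """Return True if the cluster does not look Gaussian along its split axis."""
    if len(points) < 8:  # assumed minimum size for a meaningful test
        return False
    c1, c2 = two_means(points)
    v = tuple(a - b for a, b in zip(c1, c2))
    norm2 = sum(x * x for x in v)
    if norm2 == 0.0:
        return False
    # Project every point onto the vector joining the two candidate centers.
    projected = [sum(x * y for x, y in zip(p, v)) / norm2 for p in points]
    return anderson_darling(projected) > critical


if __name__ == "__main__":
    rng = random.Random(42)
    blob = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(400)]
    pair = blob[:200] + [(rng.gauss(8, 1), rng.gauss(8, 1)) for _ in range(200)]
    print("single blob split:", should_split(blob))
    print("two blobs split:", should_split(pair))
```

In a full G-means loop, this test would be applied to every current cluster, and each cluster that fails it would be replaced by its two children before k-means is rerun on the whole dataset. The distance computations in `two_means` are the kind of regular, data-parallel work that the abstract's SIMD and SPU-task changes would plausibly target, although the abstract does not say which steps were actually optimized.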