This paper describes the performance gain obtained by adapting the G-means algorithm to a Cell BE environment using the CellSs framework. G-means is a clustering algorithm based on k-means that finds the number of Gaussian distributions, and their centers, in a multi-dimensional dataset. It is typically used in data mining applications, and its execution can be divided into six steps. This paper analyzes each step to identify which ones can be improved. In the implementation, the algorithm was modified to use the SIMD instructions specific to the Cell processor and to introduce parallelism through the CellSs framework, which manages the SPU tasks. The hardware used was an IBM BladeCenter QS22 containing two PowerXCell processors. The results show that the algorithm executes 60% faster than the unoptimized code.
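To make the idea behind G-means concrete, the following is a minimal Python sketch of its core decision: whether one cluster looks Gaussian or should be split in two. It is not the paper's Cell BE implementation, and the abstract does not list the six steps, so the structure here follows the commonly published description of G-means. The function names, the random choice of the two initial child centers, and the critical value 1.8692 (the figure usually quoted for a significance level of 0.0001) are all assumptions made for illustration.

```python
import math
import random


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def anderson_darling(values):
    """Corrected Anderson-Darling statistic for normality.

    Mean and variance are estimated from the data, so the small-sample
    correction (1 + 4/n - 25/n^2) is applied to the raw statistic.
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if std == 0.0:
        return 0.0  # degenerate input: nothing to test
    z = sorted((v - mean) / std for v in values)
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for i in range(n):
        f_lo = min(max(normal_cdf(z[i]), eps), 1.0 - eps)
        f_hi = min(max(normal_cdf(z[n - 1 - i]), eps), 1.0 - eps)
        total += (2 * i + 1) * (math.log(f_lo) + math.log(1.0 - f_hi))
    a2 = -n - total / n
    return a2 * (1.0 + 4.0 / n - 25.0 / n ** 2)


def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def two_means(points, c1, c2, iters=20):
    """Plain k-means with k = 2, starting from the given child centers."""
    for _ in range(iters):
        g1 = [p for p in points if dist2(p, c1) <= dist2(p, c2)]
        g2 = [p for p in points if dist2(p, c1) > dist2(p, c2)]
        if not g1 or not g2:
            break
        c1, c2 = centroid(g1), centroid(g2)
    return c1, c2


def should_split(points, critical=1.8692, seed=0):
    """Return True if the cluster fails the Gaussianity test.

    Assumption: the two child centers start at randomly sampled points,
    a simplification of the initialization used in published G-means.
    """
    rng = random.Random(seed)
    c1, c2 = rng.sample(points, 2)
    c1, c2 = two_means(points, c1, c2)
    v = [a - b for a, b in zip(c1, c2)]  # direction joining the children
    norm2 = sum(x * x for x in v)
    if norm2 == 0.0:
        return False
    # Project every point onto v, reducing the test to one dimension.
    projected = [sum(a * b for a, b in zip(p, v)) / norm2 for p in points]
    return anderson_darling(projected) > critical
```

A full G-means run would start from a single center and repeat this test on every cluster, replacing each cluster that fails it with its two children, until no cluster is split. The distance computations in `two_means` are the kind of data-parallel work that SIMD instructions and task-level parallelism can speed up, although this sketch makes no attempt to do so.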