A frequently used method of clustering is k-means. The k-means algorithm consists of two steps: a map step, which is simple to execute on a GPU, and a reduce step, which is more problematic. Previous researchers have used a hybrid approach in which the map step is computed on the GPU and the reduce step is performed on the CPU. In this work, we present a new algorithm for irregular reductions and apply it to k-means so that the GPU executes both the map and reduce steps. We provide experimental comparisons using OpenCL. On an ATI Radeon® HD 5870, our scheme is 3.2 times faster than the hybrid scheme for k = 10, an average of 1.5 times faster for k = 100, and on average equal for k = 400; the best observed speedup over the hybrid approach was 3.5 times. In addition, we compare the GPU code with MineBench, a standard OpenMP benchmark in which both the map and reduce steps are computed on the CPU. For large data sizes, the new GPU scheme shows great promise, with performance up to 35 times faster than MineBench on a four-core Intel i7 CPU.
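The two-step structure the abstract refers to can be sketched in plain NumPy (an illustration only, not the paper's OpenCL implementation; the function name and array shapes are assumptions). The map step is embarrassingly parallel, while the reduce step is an irregular reduction because the number of points per cluster is data-dependent:

```python
import numpy as np

def kmeans_step(points, centroids):
    """One k-means iteration, split into the two steps described above.

    points:    (n, d) array of data points
    centroids: (k, d) array of current cluster centers
    """
    # Map step: compute squared distances from every point to every
    # centroid, then label each point with its nearest centroid.
    # Each point is independent, which is why this maps well to a GPU.
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)

    # Reduce step: recompute each centroid as the mean of its assigned
    # points. Cluster sizes vary with the data, making this an
    # irregular reduction -- the part the hybrid scheme left to the CPU.
    k = centroids.shape[0]
    new_centroids = centroids.copy()
    for c in range(k):
        members = points[labels == c]
        if len(members) > 0:  # keep an empty cluster's old centroid
            new_centroids[c] = members.mean(axis=0)
    return labels, new_centroids
```

Iterating `kmeans_step` until the labels stop changing yields the full algorithm; the paper's contribution is performing the second step on the GPU as well, rather than shipping the labeled points back to the CPU each iteration.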