GPUMAFIA: efficient subspace clustering with MAFIA on GPUs

  • Authors:
  • Andrew Adinetz; Jiri Kraus; Jan Meinke; Dirk Pleiter

  • Affiliations:
  • JSC, Forschungszentrum Jülich, Jülich, Germany, Research Computing Center, Lomonosov Moscow State University, Russia; NVIDIA GmbH, Germany; JSC, Forschungszentrum Jülich, Jülich, Germany; JSC, Forschungszentrum Jülich, Jülich, Germany

  • Venue:
  • Euro-Par'13: Proceedings of the 19th International Conference on Parallel Processing
  • Year:
  • 2013

Abstract

Clustering, i.e., the identification of regions of similar objects in a multi-dimensional data set, is a standard method of data analytics with a wide variety of applications. For high-dimensional data, subspace clustering can be used to find clusters within a subset of the data points' dimensions and to alleviate the curse of dimensionality. In this paper we focus on the MAFIA subspace clustering algorithm and on using GPUs to accelerate it. We first present a number of algorithmic changes and estimate their effect on the computational complexity of the algorithm. These changes reduce the computational complexity and accelerate the sequential version by 1 to 2 orders of magnitude on practical datasets while producing exactly the same output. We then present the GPU version of the algorithm, which for typical datasets provides a further 1 to 2 orders of magnitude speedup over a single CPU core, or about an order of magnitude over a typical multi-core CPU. We believe that our faster implementation widens the applicability of MAFIA and subspace clustering.