Parallel WaveCluster: A linear scaling parallel clustering algorithm implementation with application to very large datasets

  • Authors:
  • Ahmet Artu Yıldırım;Cem Özdoğan

  • Affiliations:
  • -;-

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

A linear scaling parallel clustering algorithm implementation and its application to very large datasets for cluster analysis is reported. WaveCluster is a novel clustering approach based on wavelet transforms. Despite this approach has an ability to detect clusters of arbitrary shapes in an efficient way, it requires considerable amount of time to collect results for large sizes of multi-dimensional datasets. We propose the parallel implementation of the WaveCluster algorithm based on the message passing model for a distributed-memory multiprocessor system. In the proposed method, communication among processors and memory requirements are kept at minimum to achieve high efficiency. We have conducted the experiments on a dense dataset and a sparse dataset to measure the algorithm behavior appropriately. Our results obtained from performed experiments demonstrate that developed parallel WaveCluster algorithm exposes high speedup and scales linearly with the increasing number of processors.