Parallel Fuzzy c-Means Clustering for Large Data Sets

  • Authors:
  • Terence Kwok;Kate A. Smith;Sebastián Lozano;David Taniar

  • Affiliations:
  • -;-;-;-

  • Venue:
  • Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

The parallel fuzzy c-means (PFCM) algorithm for clustering large data sets is proposed in this paper. The proposed algorithm is designed to run on parallel computers of the Single Program Multiple Data (SPMD) model type with the Message Passing Interface (MPI). A comparison is made between PFCM and an existing parallel k-means (PKM) algorithm in terms of their parallelisation capability and scalability. In an implementation of PFCM to cluster a large data set from an insurance company, the proposed algorithm is demonstrated to have almost ideal speedups as well as an excellent scaleup with respect to the size of the data sets.