High dimensional data clustering through fuzzy possibilistic C-means with symmetry-based distance measure

  • Authors:
  • B. Shanmugapriya;M. Punithavalli

  • Affiliations:
  • Department of Computer Science, 395, Sarojini Naidu Road, New Siddhapudur, Coimbatore-641044, India;Department of Computer Applications, Vattamalaipalayam, N.G.G.O. Colony P.O, Coimbatore-641022, India

  • Venue:
  • International Journal of Computational Intelligence Studies
  • Year:
  • 2013

Abstract

Clustering high-dimensional data is one of the more difficult tasks in data clustering, largely because of the intrinsic sparsity of the data points. Several recent research results indicate that, for high-dimensional data, even the notion of proximity or clustering may not be meaningful. Fuzzy C-means (FCM) and possibilistic C-means (PCM) can handle high-dimensional data, but FCM is sensitive to noise and PCM requires appropriate initialisation to converge to a nearly global minimum. To overcome these issues, a fuzzy possibilistic C-means (FPCM) algorithm with a symmetry-based distance measure is proposed, which can determine the number of clusters present in a dataset. In addition to producing a good fuzzy partitioning of the data, the approach uses a novel fuzzy cluster validity index, called the FSym-index, which depends on the symmetry-based distance. Symmetry-based distance provides a measure of the integrity of the clustering across several fuzzy partitions of a dataset. A larger FSym-index value is reported to correspond to higher accuracy with less execution time.
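
To make the FPCM idea in the abstract concrete, here is a minimal sketch of one update step in the standard fuzzy possibilistic C-means formulation, where each point gets both a fuzzy membership and a possibilistic typicality per cluster. It is not the authors' implementation: this record does not define the symmetry-based distance or the FSym-index, so plain Euclidean distance is swapped in, and the function name `fpcm_step`, the fuzzifier `m`, the typicality exponent `eta`, and the floor `eps` are hypothetical names chosen for illustration.

```python
import math

def fpcm_step(points, centers, m=2.0, eta=2.0, eps=1e-12):
    """One FPCM update. ASSUMPTION: Euclidean distance stands in for the
    paper's symmetry-based distance, which this record does not define."""
    c, n, dim = len(centers), len(points), len(points[0])
    # d[i][k]: distance from center i to point k, floored to avoid division by zero
    d = [[max(math.dist(v, x), eps) for x in points] for v in centers]
    # Membership u[i][k] compares clusters for a fixed point; sums to 1 over clusters
    u = [[1.0 / sum((d[i][k] / d[j][k]) ** (2.0 / (m - 1.0)) for j in range(c))
          for k in range(n)] for i in range(c)]
    # Typicality t[i][k] compares points for a fixed cluster; sums to 1 over points
    t = [[1.0 / sum((d[i][k] / d[i][j]) ** (2.0 / (eta - 1.0)) for j in range(n))
          for k in range(n)] for i in range(c)]
    # New centers: mean of the points weighted by u^m + t^eta
    new_centers = []
    for i in range(c):
        w = [u[i][k] ** m + t[i][k] ** eta for k in range(n)]
        total = sum(w)
        new_centers.append(tuple(
            sum(w[k] * points[k][a] for k in range(n)) / total for a in range(dim)))
    return u, t, new_centers

# Toy data: two well-separated pairs of points and two initial centers
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
centers = [(0.0, 0.5), (5.0, 5.5)]
u, t, new_centers = fpcm_step(points, centers)
```

In practice the step would be repeated until the centers stop moving. Replacing `math.dist` with a symmetry-based distance would be the change needed to approach the method the abstract describes.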