High dimensional data clustering through fuzzy possibilistic C-means with symmetry-based distance measure

Authors:
B. Shanmugapriya;M. Punithavalli
Affiliations:
Department of Computer Science, 395, Sarojini Naidu Road, New Siddhapudur, Coimbatore-641044, India;Department of Computer Applications, Vattamalaipalayam, N.G.G.O. Colony P.O, Coimbatore-641022, India
Venue:
International Journal of Computational Intelligence Studies
Year:
2013



Abstract

Clustering high-dimensional data is one of the more difficult tasks in data clustering, and it has been a major concern owing to the intrinsic sparsity of the data points. Several recent research results indicate that, for high-dimensional data, even the notion of proximity or clustering may not be meaningful. Fuzzy C-means (FCM) and possibilistic C-means (PCM) can handle high-dimensional data, but FCM is sensitive to noise and PCM requires appropriate initialisation to converge to a nearly global minimum. To overcome these issues, a fuzzy possibilistic C-means (FPCM) algorithm with a symmetry-based distance measure is proposed, which can determine the number of clusters that exist in a dataset. Along with producing a good fuzzy partitioning of the data, the approach uses a novel fuzzy cluster validity index called the FSym-index, which depends on the symmetry-based distance. The symmetry-based distance provides a measure of the integrity of clustering on several fuzzy partitions of a dataset. A larger FSym-index value corresponds to higher accuracy, with less execution time.
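
For illustration only: the abstract does not give the update equations, so the following is a minimal sketch of the standard textbook FCM membership update, the step that assigns each point a degree of membership in every cluster. It is not the authors' FPCM implementation. Euclidean distance is used as a stand-in, and the `dist` parameter marks where a symmetry-based distance could be substituted. All function and variable names here are hypothetical.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance; a stand-in for a symmetry-based distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fcm_memberships(points, centers, m=2.0, dist=euclidean):
    """Standard FCM membership update.

    Returns U where U[i][j] is the membership of points[i] in the cluster
    with center centers[j]. m > 1 is the fuzzifier.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        d = [dist(p, c) for c in centers]
        if any(v == 0.0 for v in d):
            # The point coincides with at least one center: share the
            # membership equally among those centers to avoid dividing by zero.
            hits = [1.0 if v == 0.0 else 0.0 for v in d]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        # u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1))
        row = [1.0 / sum((d_j / d_k) ** exponent for d_k in d) for d_j in d]
        memberships.append(row)
    return memberships


# Toy data (assumed for this example): three 2-D points, two cluster centers.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centers = [(0.0, 0.0), (4.0, 0.0)]
U = fcm_memberships(points, centers)
```

Each row of `U` sums to 1, which is the FCM constraint that makes the method sensitive to noise: an outlier far from every center still receives memberships that add up to 1. Possibilistic variants relax this constraint.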