A fuzzy threshold based modified clustering algorithm for natural data exploration

  • Authors:
  • Binu Thomas;G. Raju

  • Affiliations:
  • Dept. of Computer Applications, Marian College, Kuttikkanam, Kerala, India;Department of Information Technology, Kannur University, Kannur, Kerala, India

  • Venue:
  • PAISI'10 Proceedings of the 2010 Pacific Asia conference on Intelligence and Security Informatics
  • Year:
  • 2010

Abstract

Traditional supervised clustering methods require the user to specify the number of clusters before any data exploration can begin. The data engineer also has to select the initial cluster seeds. In the c-means clustering method, the performance of the algorithm depends mainly on the initial choice of the number of clusters and the cluster seeds. With real-world data, choosing the initial cluster count and centroids becomes a tedious task. In this paper we propose a modified clustering algorithm that works on the principles of fuzzy clustering. The proposed method uses a modified form of the popular fuzzy c-means algorithm for membership calculation. The algorithm begins with the assumption that all the data points are initial centroids. The clusters are then repeatedly merged based on a threshold value until the optimum number of clusters is reached. The algorithm is also capable of detecting outliers. The algorithm was tested with data from the Gross National Happiness (GNH) program of Bhutan and was found to be highly efficient in segmenting natural data sets.
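
The abstract does not give the modified membership formula, the merge rule, or the outlier criterion, so the following Python sketch only illustrates the general idea under stated assumptions and is not the authors' algorithm. It uses the standard fuzzy c-means membership and centroid update, starts with every data point as a centroid, and merges the closest pair of centroids while their distance is below a threshold. The names `threshold_cluster`, `merge_dist` and `min_size`, the distance-based merge rule, and the rule that clusters with fewer than `min_size` members are outliers are all hypothetical choices made for illustration.

```python
import math

EPS = 1e-12  # distances below this are treated as zero


def memberships(points, centroids, m=2.0):
    """Standard fuzzy c-means membership of every point in every cluster."""
    power = 2.0 / (m - 1.0)
    table = []
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        zeros = [j for j, d in enumerate(dists) if d < EPS]
        if zeros:
            # The point sits on one or more centroids: share full membership.
            row = [1.0 / len(zeros) if j in zeros else 0.0
                   for j in range(len(centroids))]
        else:
            inv = [d ** -power for d in dists]
            total = sum(inv)
            row = [v / total for v in inv]
        table.append(row)
    return table


def update_centroids(points, centroids, u, m=2.0):
    """Standard fuzzy c-means centroid update (membership-weighted mean)."""
    dim = len(points[0])
    updated = []
    for j, old in enumerate(centroids):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        if total == 0.0:
            updated.append(old)  # no point supports this centroid; keep it
            continue
        updated.append(tuple(
            sum(w * p[k] for w, p in zip(weights, points)) / total
            for k in range(dim)))
    return updated


def threshold_cluster(points, merge_dist, m=2.0, min_size=2):
    """Assumed merge rule: join the closest centroids while nearer than merge_dist."""
    centroids = [tuple(p) for p in points]  # every point starts as a centroid
    while len(centroids) > 1:
        d, a, b = min(
            (math.dist(centroids[i], centroids[j]), i, j)
            for i in range(len(centroids))
            for j in range(i + 1, len(centroids)))
        if d >= merge_dist:
            break  # no pair is close enough; stop merging
        merged = tuple((x + y) / 2.0 for x, y in zip(centroids[a], centroids[b]))
        centroids = [c for k, c in enumerate(centroids) if k not in (a, b)]
        centroids.append(merged)
        u = memberships(points, centroids, m)
        centroids = update_centroids(points, centroids, u, m)
    u = memberships(points, centroids, m)
    # Hard labels: each point goes to the cluster with its highest membership.
    labels = [max(range(len(centroids)), key=row.__getitem__) for row in u]
    sizes = [labels.count(j) for j in range(len(centroids))]
    # Assumed outlier rule: members of clusters smaller than min_size.
    outliers = [i for i, lab in enumerate(labels) if sizes[lab] < min_size]
    return centroids, labels, outliers
```

Calling `threshold_cluster(points, merge_dist=3.0)` on a list of coordinate tuples returns the final centroids, a hard label per point, and the indices of points flagged as outliers. The number of clusters is never supplied by the caller; it follows from the threshold, which is the property the abstract emphasises. The closest-pair search makes this sketch cubic in the number of points, so it is suitable only for small data sets.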