Clustering large data with uncertainty

Authors: Sampreeti Ghosh; Sushmita Mitra
Affiliation (both authors): Machine Intelligence Unit, Indian Statistical Institute, Kolkata 700 108, India
Venue: Applied Soft Computing
Year: 2013

Abstract

A new algorithm is designed for handling fuzziness while mining large data. A novel cost function, weighted by fuzzy membership, is proposed in the framework of CLARANS. A scalable approximation to the maximum number of neighbors explored at each node is also developed; it reduces the computational time for large data and eliminates the need for the user-defined (heuristic) parameters in the existing equation. The goodness of the generated clusters is evaluated in terms of the Xie-Beni validity index. Results on both synthetic and real data sets demonstrate the superiority of the proposed algorithm in terms of goodness of clustering. Notably, the algorithm always converges to the globally best values at the optimal number of partitions. Moreover, compared to existing fuzzy algorithms, the proposed algorithm, FCLARANS, is able to handle the uncertainty due to the overlapping nature of the various partitions without scanning the whole dataset, searching only a small number of neighbors. This is the main motivation for the fuzzification of CLARANS.
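
The abstract does not state the cost function or the neighbor bound explicitly, so the following is only a minimal sketch of the two ingredients it names: a cost weighted by fuzzy membership, and the Xie-Beni validity index. It assumes standard fuzzy c-means style memberships with fuzzifier m, squared Euclidean distance, and medoids used in place of centroids. The function names (`fuzzy_memberships`, `fuzzy_cost`, `xie_beni`) and the toy data are hypothetical and are not taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_memberships(points, medoids, m=2.0):
    """Return u[j][i], the FCM-style membership of point j in cluster i.

    Assumption: memberships follow the usual fuzzy c-means formula,
    evaluated against medoids rather than centroids.
    """
    memberships = []
    for p in points:
        d2 = [sq_dist(p, c) for c in medoids]
        if any(d == 0.0 for d in d2):
            # The point coincides with a medoid: give it full membership
            # there (shared equally if several medoids coincide).
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
        total = sum(inv)
        memberships.append([v / total for v in inv])
    return memberships


def fuzzy_cost(points, medoids, m=2.0):
    """Membership-weighted cost: sum over j, i of u_ji^m * ||x_j - c_i||^2."""
    u = fuzzy_memberships(points, medoids, m)
    return sum(
        (u[j][i] ** m) * sq_dist(p, c)
        for j, p in enumerate(points)
        for i, c in enumerate(medoids)
    )


def xie_beni(points, medoids, m=2.0):
    """Xie-Beni index: compactness / (n * minimum squared separation).

    Lower values indicate compact, well-separated clusters.
    """
    separation = min(
        sq_dist(a, b)
        for i, a in enumerate(medoids)
        for b in medoids[i + 1:]
    )
    return fuzzy_cost(points, medoids, m) / (len(points) * separation)


# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
good = [(0.0, 0.0), (10.0, 10.0)]  # one medoid per group
bad = [(0.0, 0.0), (0.0, 1.0)]     # both medoids in the same group

print(xie_beni(points, good), xie_beni(points, bad))
```

In a CLARANS-style search, a neighbor of the current node is obtained by swapping one medoid for a non-medoid point, and a cost of this kind would decide whether the swap is accepted. On the toy data, the medoid set that places one medoid in each group yields the lower Xie-Beni value, which is the behavior the index is meant to capture.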