A fuzzy c-means clustering algorithm based on nearest-neighbor intervals for incomplete data

Authors:
Dan Li;Hong Gu;Liyong Zhang
Affiliations:
School of Electronic and Information Engineering, Dalian University of Technology, Dalian 116024, China;School of Electronic and Information Engineering, Dalian University of Technology, Dalian 116024, China;School of Electronic and Information Engineering, Dalian University of Technology, Dalian 116024, China
Venue:
Expert Systems with Applications: An International Journal
Year:
2010

Abstract

Partially missing data sets are a prevalent problem in clustering analysis. In this paper, missing attributes are represented as intervals, and a novel fuzzy c-means algorithm for incomplete data based on nearest-neighbor intervals is proposed. The algorithm estimates the nearest-neighbor interval representation of each missing attribute by making full use of the attribute distribution information in the data set, which enhances the robustness of missing attribute imputation compared with other numerical imputation methods. In addition, the convex hyper-polyhedrons formed by the interval prototypes represent the uncertainty of the missing attributes and, to some degree, reflect the shape of the clusters, which helps make the clustering analysis more robust. Comparison and analysis of the experimental results on several UCI data sets demonstrate the capability of the proposed algorithm.
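The interval representation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the estimation procedure, so everything below is an assumption made for illustration: missing values are marked with NaN, neighbors are ranked by a partial distance over jointly observed attributes, and the interval for a missing attribute is taken as the minimum and maximum of that attribute over the q nearest neighbors that observe it. The function name `nearest_neighbor_intervals` and the parameter `q` are hypothetical and are not taken from the paper.

```python
import numpy as np

def nearest_neighbor_intervals(X, q=3):
    """Return (lower, upper) bounds for X, where NaN marks a missing attribute.

    Observed attributes get a degenerate interval (lower == upper).
    Assumption: neighbors are ranked by a partial distance, not by the
    paper's exact procedure, which the abstract does not specify.
    """
    X = np.asarray(X, dtype=float)
    n, s = X.shape
    observed = ~np.isnan(X)
    lower = X.copy()
    upper = X.copy()
    # Visit only the samples that have at least one missing attribute.
    for k in np.where(~observed.all(axis=1))[0]:
        # Partial distance to every other sample over jointly observed attributes.
        dists = np.full(n, np.inf)
        for p in range(n):
            if p == k:
                continue
            shared = observed[k] & observed[p]
            if not shared.any():
                continue
            diff = X[k, shared] - X[p, shared]
            # Rescale by s / (number of shared attributes) so distances are comparable.
            dists[p] = np.sqrt(s / shared.sum() * np.sum(diff ** 2))
        order = np.argsort(dists)
        for j in np.where(~observed[k])[0]:
            # Keep the q nearest neighbors that actually observe attribute j.
            candidates = [p for p in order if np.isfinite(dists[p]) and observed[p, j]]
            neighbors = candidates[:q]
            if neighbors:
                vals = X[neighbors, j]
                lower[k, j] = vals.min()
                upper[k, j] = vals.max()
    return lower, upper
```

A wider interval signals that the neighbors disagree about the missing value, which is the kind of uncertainty the abstract says the interval prototypes carry into clustering. If a missing attribute has no usable neighbor, this sketch leaves both bounds as NaN rather than guessing.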