Incomplete and noisy data significantly distort data mining results, so handling missing values and noise is a crucial step in data mining. Recent research has begun to exploit data clustering techniques to estimate missing values, and the quality of the clustering directly influences the accuracy of the estimates. The clustering problem has been proven to be NP-hard, and particle swarm optimization (PSO) is a recently suggested heuristic search process for solving it. In this paper, a compounded PSO (CPSO) clustering approach is proposed for missing value estimation. Normalization methods are first applied to filter outliers and to prevent some attributes from dominating the clustering result. The K-means algorithm and a reflex mechanism are then combined with standard PSO clustering so that the search converges quickly to a reasonably good solution. Meanwhile, an iteration-based filling-in scheme guides the CPSO clustering search toward the optimal estimated values. The effectiveness of the proposed approach is demonstrated on several data sets at four different rates of missing data. The empirical evaluation shows that CPSO outperforms the well-known K-means, PSO, and SOM-based approaches, making it a desirable method for solving missing value problems.
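
The general idea of clustering-based estimation with an iterative filling-in step can be illustrated with a minimal sketch. This is not the CPSO method: plain K-means stands in for the PSO-based clustering, min-max scaling is assumed as the normalization, and all function names (`min_max_normalize`, `kmeans`, `impute_by_clustering`) and parameters are hypothetical choices made only for this example.

```python
import numpy as np

def min_max_normalize(X):
    """Scale each column to [0, 1], ignoring NaN entries (assumed normalization)."""
    lo = np.nanmin(X, axis=0)
    hi = np.nanmax(X, axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (X - lo) / span, lo, span

def kmeans(X, k, n_iter=50, seed=0):
    """Plain K-means; a stand-in for the paper's PSO-based clustering."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dist.argmin(axis=1)

def impute_by_clustering(X, k=2, max_rounds=20, tol=1e-6, seed=0):
    """Iteratively refill missing entries with the centroid of their cluster."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    Z, lo, span = min_max_normalize(X)
    # Initial guess: column means of the observed values.
    Z = np.where(missing, np.nanmean(Z, axis=0), Z)
    for _ in range(max_rounds):
        centroids, labels = kmeans(Z, k, seed=seed)
        new_Z = np.where(missing, centroids[labels], Z)
        shift = np.abs(new_Z - Z).max()
        Z = new_Z
        if shift < tol:  # estimates have stopped changing
            break
    return Z * span + lo  # map back to the original scale

# Toy data: two well-separated groups, one missing entry in the first group.
X_demo = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, np.nan],
                   [10.0, 10.0], [10.2, 9.8], [9.9, 10.1]])
filled = impute_by_clustering(X_demo, k=2)
print(filled[2])
```

Each round reclusters the completed data and overwrites only the originally missing cells, so observed values are never altered. Swapping `kmeans` for a PSO-based search would change how centroids are found while leaving the filling-in loop unchanged.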