Developing a feature weight self-adjustment mechanism for a K-means clustering algorithm

Authors:
Chieh-Yuan Tsai; Chuang-Cheng Chiu
Affiliations:
Department of Industrial Engineering and Management, Yuan Ze University, 135 Yuantung Road, Chungli City, Taoyuan County 320, Taiwan;Department of Industrial Engineering and Management, Yuan Ze University, 135 Yuantung Road, Chungli City, Taoyuan County 320, Taiwan
Venue:
Computational Statistics & Data Analysis
Year:
2008

Abstract

K-means is one of the most popular and widespread partitioning clustering algorithms due to its superior scalability and efficiency. Typically, the K-means algorithm treats all features equally, assigning them the same weight when evaluating dissimilarity. However, a meaningful clustering structure often occurs in a subspace defined by a specific subset of the features. To address this issue, this paper proposes a novel feature weight self-adjustment (FWSA) mechanism embedded into K-means in order to improve its clustering quality. In the FWSA mechanism, finding feature weights is modeled as an optimization problem that simultaneously minimizes the separations within clusters and maximizes the separations between clusters. With this objective, the adjustment margin of a feature weight can be derived from the importance of the feature to the clustering quality. At each iteration of K-means, all feature weights are adaptively updated by adding their respective adjustment margins. Experiments on a number of synthetic and real data sets show the benefits of the proposed FWSA mechanism. In addition, when compared to a recent similar feature weighting work, the proposed mechanism shows several advantages in both the theoretical and experimental results.
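
The abstract does not state the exact formulas, so the following is only a minimal sketch of the idea it describes, not the authors' implementation. It assumes squared Euclidean distance per feature and an adjustment margin proportional to each feature's ratio of between-cluster to within-cluster separation. It also assumes the updated weight is the average of the old weight and the margin, so that the weights keep summing to 1. The function name `fwsa_kmeans` and its parameters are hypothetical.

```python
import numpy as np


def fwsa_kmeans(X, k, n_iter=20, init_centers=None, seed=0):
    """Weighted K-means with an illustrative feature weight self-adjustment step.

    Assumption: the margin formula below is a plausible reading of the
    abstract, not a formula taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)
    if init_centers is None:
        # Pick k distinct data points as the starting centroids.
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.array(init_centers, dtype=float)

    w = np.full(d, 1.0 / d)          # start with equal feature weights
    global_mean = X.mean(axis=0)     # centroid of the whole data set
    labels = np.zeros(n, dtype=int)

    for _ in range(n_iter):
        # Assignment step: weighted squared distance to every centroid.
        sq = (X[:, None, :] - centers[None, :, :]) ** 2   # shape (n, k, d)
        labels = (sq * w).sum(axis=2).argmin(axis=1)

        # Update step: recompute centroids, keeping old ones for empty clusters.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)

        # Per-feature separation within clusters (to be minimized).
        within = ((X - centers[labels]) ** 2).sum(axis=0)
        # Per-feature separation between clusters (to be maximized).
        counts = np.bincount(labels, minlength=k)
        between = (counts[:, None] * (centers - global_mean) ** 2).sum(axis=0)

        # Assumed adjustment margin: normalized between/within ratio.
        ratio = between / np.maximum(within, 1e-12)
        if ratio.sum() > 0:
            margin = ratio / ratio.sum()
            # Add the margin to the weight, then halve to keep sum(w) == 1.
            w = (w + margin) / 2.0

    return labels, centers, w
```

On data where only one feature separates the clusters and the other is noise, this sketch should move most of the weight to the informative feature, since its between-to-within ratio dominates the normalized margin.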