Automated Variable Weighting in k-Means Type Clustering

Authors:
Joshua Zhexue Huang;Michael K. Ng;Hongqiang Rong;Zichen Li
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2005

Abstract

This paper proposes a k-means type clustering algorithm that can automatically calculate variable weights. A new step is introduced into the k-means clustering process to iteratively update variable weights based on the current partition of the data, and a formula for weight calculation is proposed. A convergence theorem for the new clustering process is given. The variable weights produced by the algorithm measure the importance of variables in clustering and can be used for variable selection in data mining applications, where large and complex real data are often involved. Experimental results on both synthetic and real data show that the new algorithm outperformed the standard k-means type algorithms in recovering clusters in data.
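The abstract describes the extra weight-update step but does not state the formula. The following is a minimal pure-Python sketch of how such a loop could look, not the authors' implementation. It assumes squared Euclidean distance, an exponent parameter `beta` greater than 1, and the update w_j = 1 / sum_t (D_j / D_t)^(1/(beta-1)), where D_j is the within-cluster dispersion on variable j and w_j = 0 when D_j = 0. The function names `update_weights` and `weighted_kmeans`, the fixed iteration count, and the random initialization are all illustrative choices.

```python
import random


def update_weights(X, labels, centers, beta=2.0):
    """Assumed weight-update step: variables with low within-cluster
    dispersion under the current partition receive higher weights."""
    m = len(X[0])
    # D[j]: squared deviation from the assigned center, summed over
    # all objects, on variable j.
    D = [0.0] * m
    for x, lab in zip(X, labels):
        for j in range(m):
            D[j] += (x[j] - centers[lab][j]) ** 2
    nonzero = [d for d in D if d > 0]
    if not nonzero:
        # Degenerate case (every dispersion is zero): fall back to
        # uniform weights. This fallback is an assumption of the sketch.
        return [1.0 / m] * m
    exponent = 1.0 / (beta - 1.0)
    return [
        0.0 if d == 0 else 1.0 / sum((d / t) ** exponent for t in nonzero)
        for d in D
    ]


def weighted_kmeans(X, k, beta=2.0, iters=20, seed=0):
    """k-means loop with the extra weight-update step (illustrative)."""
    rng = random.Random(seed)
    m = len(X[0])
    centers = [list(x) for x in rng.sample(X, k)]
    weights = [1.0 / m] * m
    labels = [0] * len(X)
    for _ in range(iters):
        # Step 1: assign each object to the nearest center under the
        # weighted squared distance sum_j w_j^beta * (x_j - z_j)^2.
        labels = [
            min(
                range(k),
                key=lambda l: sum(
                    weights[j] ** beta * (x[j] - centers[l][j]) ** 2
                    for j in range(m)
                ),
            )
            for x in X
        ]
        # Step 2: recompute each center as the mean of its members
        # (an empty cluster keeps its previous center).
        for l in range(k):
            members = [x for x, lab in zip(X, labels) if lab == l]
            if members:
                centers[l] = [sum(col) / len(members) for col in zip(*members)]
        # Step 3: the additional step, updating the variable weights
        # from the current partition.
        weights = update_weights(X, labels, centers, beta)
    return labels, centers, weights
```

With this assumed update the weights are non-negative and sum to one. Like other k-means type procedures, the loop can settle in a local optimum, so which variables end up with high weights depends on the initial centers.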