Minkowski metric, feature weighting and anomalous cluster initializing in K-Means clustering

Authors:
Renato Cordeiro de Amorim;Boris Mirkin
Affiliations:
Department of Computer Science and Information Systems, Birkbeck University of London, Malet Street, London WC1E 7HX, UK;Department of Computer Science and Information Systems, Birkbeck University of London, Malet Street, London WC1E 7HX, UK and Department of Data Analysis and Machine Intelligence, National Research ...
Venue:
Pattern Recognition
Year:
2012

Citing 29
Cited 4

Pattern Recognition with Fuzzy Objective Function Algorithms
Clustering Algorithms
On the Surprising Behavior of Distance Metrics in High Dimensional Spaces

ICDT '01 Proceedings of the 8th International Conference on Database Theory
Top-Down Induction of Clustering Trees

ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
X-means: Extending K-means with Efficient Estimation of the Number of Clusters

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data

Machine Learning
Feature Weighting in k-Means Clustering

Machine Learning
Integrating constraints and metric learning in semi-supervised clustering

ICML '04 Proceedings of the twenty-first international conference on Machine learning
On a resampling approach for tests on the number of clusters with mixture model-based clustering of tissue samples

Journal of Multivariate Analysis
Clustering For Data Mining: A Data Recovery Approach (Chapman & Hall/Crc Computer Science)
Automated Variable Weighting in k-Means Type Clustering

IEEE Transactions on Pattern Analysis and Machine Intelligence
Clustering with Bregman Divergences

The Journal of Machine Learning Research
Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)
Learning multicriteria fuzzy classification method PROAFTN from data

Computers and Operations Research
The Concentration of Fractional Distances

IEEE Transactions on Knowledge and Data Engineering
Initializing K-means Batch Clustering: A Critical Evaluation of Several Techniques

Journal of Classification
A survey of kernel and spectral methods for clustering

Pattern Recognition
Developing a feature weight self-adjustment mechanism for a K-means clustering algorithm

Computational Statistics & Data Analysis
Unsupervised feature selection using clustering ensembles and population based incremental learning algorithm

Pattern Recognition
Non-negative matrix factorization for semi-supervised data clustering

Knowledge and Information Systems
An initialization method for the K-Means algorithm using neighborhood model

Computers & Mathematics with Applications
Single point iterative weighted fuzzy C-means clustering algorithm for remote sensing image segmentation

Pattern Recognition
Worst-Case and Smoothed Analysis of the ICP Algorithm, with an Application to the k-Means Method

SIAM Journal on Computing
Fuzzy C-Mean Algorithm with Morphology Similarity Distance

FSKD '09 Proceedings of the 2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery - Volume 03
Evaluating clustering in subspace projections of high dimensional data

Proceedings of the VLDB Endowment
Data clustering: 50 years beyond K-means

Pattern Recognition Letters
The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List

The Journal of Machine Learning Research
Intelligent Choice of the Number of Clusters in K-Means Clustering: An Experimental Study with Different Cluster Spreads

Journal of Classification
The p-norm generalization of the LMS algorithm for adaptive filtering

IEEE Transactions on Signal Processing

Partitive clustering (K-means family)

Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Weighting features for partition around medoids using the minkowski metric

IDA'12 Proceedings of the 11th international conference on Advances in Intelligent Data Analysis
On initializations for the minkowski weighted k-means

IDA'12 Proceedings of the 11th international conference on Advances in Intelligent Data Analysis
Joint image denoising using adaptive principal component analysis and self-similarity

Information Sciences: an International Journal

Abstract

This paper takes another step toward overcoming a drawback of K-Means, its lack of defense against noisy features, by using feature weights in the criterion. The Weighted K-Means method of Huang et al. (2008, 2004, 2005) [5-7] is extended to the corresponding Minkowski metric for measuring distances. Under the Minkowski metric, the feature weights become intuitively appealing feature rescaling factors in a conventional K-Means criterion. To see how this can be used to address another issue of K-Means, the initial setting, a method for initializing K-Means with anomalous clusters is adapted. The Minkowski metric based method is experimentally validated on datasets from the UCI Machine Learning Repository and on generated sets of Gaussian clusters, both as they are and with additional uniform random noise features, and it appears to be competitive with other K-Means based feature weighting algorithms.
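
The rescaling interpretation mentioned in the abstract can be illustrated with a small sketch. It assumes, as an illustration only, that the weighted distance has the form of a sum over features of w_v^p |x_v - c_v|^p, with the weight raised to the same exponent p as the Minkowski metric; the abstract does not spell out this formula, and the function names and sample values below are hypothetical. Under that assumption, the weighted distance between a point and a centroid equals the ordinary Minkowski distance (to the p-th power) between the same two vectors after each feature has been multiplied by its weight.

```python
def weighted_minkowski(x, c, w, p):
    """p-th power of a weighted Minkowski distance (assumed form):
    sum over features of w_v**p * |x_v - c_v|**p."""
    return sum((wv ** p) * abs(xv - cv) ** p for xv, cv, wv in zip(x, c, w))


def minkowski(x, c, p):
    """p-th power of the ordinary (unweighted) Minkowski distance."""
    return sum(abs(xv - cv) ** p for xv, cv in zip(x, c))


def rescale(x, w):
    """Multiply each feature value by its weight."""
    return [wv * xv for xv, wv in zip(x, w)]


# Hypothetical data: one point, one centroid, weights summing to 1.
x = [1.0, 4.0, 2.0]
c = [0.0, 1.0, 2.5]
w = [0.5, 0.3, 0.2]
p = 3.0

# Distance computed with explicit feature weights.
direct = weighted_minkowski(x, c, w, p)

# Same quantity computed as a plain Minkowski distance on rescaled data.
via_rescaling = minkowski(rescale(x, w), rescale(c, w), p)

print(direct, via_rescaling)
```

The two printed values agree up to floating-point rounding, which is the sense in which the weights act as feature rescaling factors: a noisy feature with a small weight is shrunk before a conventional K-Means style distance is applied.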