This paper presents an analysis of the number of iterations K-Means takes to converge under different initializations. We experimented with seven initialization algorithms on a total of 37 real and synthetic datasets, and found that hierarchical-based initializations tend to be the most effective at reducing the number of iterations; in particular, a divisive algorithm using the Ward criterion performed best on the real datasets.
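The experimental setup described above can be sketched as follows: run Lloyd's K-Means with an explicit iteration counter and compare the count under two seedings. This is a minimal illustration, not the paper's method; the crude farthest-point seeding below is an assumed stand-in for the hierarchical (Ward-based) initializations evaluated in the paper, and the two-blob dataset is synthetic.

```python
import numpy as np

def kmeans(X, centers, tol=1e-6, max_iter=300):
    """Lloyd's K-Means; returns (centers, labels, iterations to converge)."""
    centers = centers.astype(float)
    for it in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: recompute each center as its cluster mean
        # (keep the old center if a cluster went empty).
        new_centers = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(len(centers))
        ])
        if np.linalg.norm(new_centers - centers) < tol:
            return new_centers, labels, it
        centers = new_centers
    return centers, labels, max_iter

rng = np.random.default_rng(0)
# Synthetic data: two well-separated Gaussian blobs (for illustration only).
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(8, 1, (100, 2))])

# Initialization A: two random data points.
rand_init = X[rng.choice(len(X), size=2, replace=False)]
# Initialization B: crude farthest-point seeding (a hypothetical stand-in
# for a more careful, e.g. hierarchical, initialization).
far_init = np.array([X[0], X[np.linalg.norm(X - X[0], axis=1).argmax()]])

_, _, it_rand = kmeans(X, rand_init)
_, _, it_far = kmeans(X, far_init)
print(f"iterations: random init = {it_rand}, farthest-point init = {it_far}")
```

Repeating such runs over many datasets and averaging the iteration counts per initialization scheme is the kind of comparison the study reports.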