A method for initialising the K-means clustering algorithm using kd-trees

Authors:
Stephen J. Redmond; Conor Heneghan
Affiliations:
Department of Electronic Engineering, University College Dublin, Belfield, Dublin 4, Ireland (both authors)
Venue:
Pattern Recognition Letters
Year:
2007

Abstract

We present a method for initialising the K-means clustering algorithm. Our method hinges on the use of a kd-tree to estimate the density of the data at various locations. We then use a modification of Katsavounidis' algorithm, which incorporates this density information, to choose K seeds for the K-means algorithm. We test our algorithm on 36 synthetic datasets and 2 datasets from the UCI Machine Learning Repository, and compare it with 15 runs of Forgy's random initialisation method, with Katsavounidis' algorithm, and with Bradley and Fayyad's method.
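
The abstract gives only the outline of the procedure, so the following Python sketch is one possible reading of it rather than the authors' implementation. It assumes that the kd-tree is built by median splits along the widest axis, that the density of a leaf bucket is its point count divided by the volume of its bounding box, and that the modified seeding step picks the densest bucket first and then repeatedly picks the bucket that maximises (distance to the nearest chosen seed) x (density). The function names and the `max_leaf_size` parameter are hypothetical and do not come from the paper.

```python
import numpy as np


def leaf_buckets(points, max_leaf_size):
    """Recursively split points at the median of the widest axis (a simple
    kd-tree) and return the leaf buckets as a list of arrays."""
    if len(points) <= max_leaf_size:
        return [points]
    spans = points.max(axis=0) - points.min(axis=0)
    axis = int(np.argmax(spans))
    if spans[axis] == 0:  # every point is identical, so no split is possible
        return [points]
    order = np.argsort(points[:, axis], kind="stable")
    mid = len(points) // 2
    return (leaf_buckets(points[order[:mid]], max_leaf_size)
            + leaf_buckets(points[order[mid:]], max_leaf_size))


def bucket_summaries(buckets, eps=1e-12):
    """Return the centroid and a density estimate for each leaf bucket.

    Assumption: density = point count / bounding-box volume. Side lengths are
    clamped to eps so that degenerate boxes do not divide by zero."""
    centres = np.array([b.mean(axis=0) for b in buckets])
    volumes = np.array([
        np.prod(np.maximum(b.max(axis=0) - b.min(axis=0), eps))
        for b in buckets
    ])
    counts = np.array([len(b) for b in buckets])
    return centres, counts / volumes


def density_weighted_seeds(points, k, max_leaf_size=20):
    """Choose k seeds from the leaf-bucket centroids.

    Assumed selection rule: start from the densest bucket, then repeatedly
    take the bucket with the largest (distance to nearest seed) * density."""
    points = np.asarray(points, dtype=float)
    centres, densities = bucket_summaries(leaf_buckets(points, max_leaf_size))
    if k > len(centres):
        raise ValueError("k exceeds the number of leaf buckets")
    chosen = [int(np.argmax(densities))]
    min_dist = np.linalg.norm(centres - centres[chosen[0]], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(min_dist * densities))
        chosen.append(nxt)
        dist_to_new = np.linalg.norm(centres - centres[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return centres[chosen]
```

Under these assumptions the procedure is deterministic, so a single call such as `density_weighted_seeds(data, k=3)` yields one seed set, which could then be handed to any K-means implementation that accepts explicit initial centres. Weighting the distance by density is what discourages the choice of isolated outliers, which a purely distance-based maximin rule tends to favour.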