A Genetic Algorithm Using Hyper-Quadtrees for Low-Dimensional K-means Clustering

Authors:
Michael Laszlo;Sumitra Mukherjee
Affiliations:
-;IEEE
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2006


Abstract

The k-means algorithm is widely used for clustering because of its computational efficiency. Given n points in d-dimensional space and the number of desired clusters k, k-means seeks a set of k cluster centers so as to minimize the sum of the squared Euclidean distance between each point and its nearest cluster center. However, the algorithm is very sensitive to the initial selection of centers and is likely to converge to partitions that are significantly inferior to the global optimum. We present a genetic algorithm (GA) for evolving centers in the k-means algorithm that simultaneously identifies good partitions for a range of values around a specified k. The set of centers is represented using a hyper-quadtree constructed on the data. This representation is exploited in our GA to generate an initial population of good centers and to support a novel crossover operation that selectively passes good subsets of neighboring centers from parents to offspring by swapping subtrees. Experimental results indicate that our GA finds the global optimum for data sets with known optima and finds good solutions for large simulated data sets.
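The objective and the sensitivity to initial centers described in the abstract can be illustrated with a minimal sketch. It covers only the standard k-means objective and plain Lloyd-style iterations, not the GA or the hyper-quadtree representation of the paper. The function names (`sq_dist`, `sse`, `lloyd`) and the four-point data set are hypothetical, chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(p: Sequence[float], q: Sequence[float]) -> float:
    """Squared Euclidean distance between two d-dimensional points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def sse(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """k-means objective: sum over points of the squared distance
    to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def lloyd(points: Sequence[Point], centers: Sequence[Point],
          max_iters: int = 100) -> List[Point]:
    """Plain Lloyd iterations (assign, then recompute means).
    Converges to a local optimum that depends on the initial centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest center.
        clusters: List[List[Point]] = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster;
        # an empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Hypothetical data: corners of a 4 x 1 rectangle, k = 2.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

bad_init = [(2.0, 0.0), (2.0, 1.0)]   # splits the data into bottom / top rows
good_init = [(0.0, 0.0), (4.0, 1.0)]  # splits the data into left / right pairs

print(sse(points, lloyd(points, bad_init)))   # 16.0 (stuck in a poor local optimum)
print(sse(points, lloyd(points, good_init)))  # 1.0  (the optimal 2-partition here)
```

Both runs converge, but the first stops at a partition whose objective value is sixteen times worse than the second. This is the kind of initialization dependence that motivates searching over sets of centers rather than relying on a single starting point.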