The k-means algorithm and its variations are known to be fast clustering algorithms. However, they are sensitive to the choice of starting points and are inefficient on large datasets. Recently, incremental approaches have been developed to resolve the difficulties with the choice of starting points. The global k-means and the modified global k-means algorithms are based on such an approach: they build the solution iteratively, adding one cluster center at a time. Numerical experiments show that these algorithms considerably improve on the k-means algorithm. However, they require either storing the whole affinity matrix or recomputing it at each iteration, which makes both algorithms time-consuming and memory-demanding even for moderately large datasets. In this paper, a new version of the modified global k-means algorithm is proposed. We introduce an auxiliary cluster function to generate a set of starting points lying in different parts of the dataset. We exploit information gathered in previous iterations of the incremental algorithm to eliminate the need to compute or store the whole affinity matrix, thereby reducing computational effort and memory usage. Results of numerical experiments on six standard datasets demonstrate that the new algorithm is more efficient than the global and the modified global k-means algorithms.
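To make the incremental idea concrete, the following is a minimal sketch of the basic global k-means scheme that the abstract refers to: start from the centroid of the whole dataset, then add one center at a time, trying every data point as the starting position of the new center and keeping the trial with the smallest sum-of-squares error. It is not the algorithm proposed in the paper: the auxiliary cluster function and the reuse of information from previous iterations are not shown, and the names `sq_dist`, `mean`, `kmeans`, and `global_kmeans` are hypothetical helpers chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points: List[Point]) -> Point:
    """Coordinate-wise mean (centroid) of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(data: List[Point], start: List[Point],
           max_iter: int = 100) -> Tuple[List[Point], float]:
    """Plain Lloyd iterations from the given starting centers.

    Returns the final centers and the sum-of-squares error.
    """
    centers = list(start)
    for _ in range(max_iter):
        clusters: List[List[Point]] = [[] for _ in centers]
        for p in data:
            nearest = min(range(len(centers)),
                          key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        updated = [mean(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    error = sum(min(sq_dist(p, c) for c in centers) for p in data)
    return centers, error


def global_kmeans(data: List[Point], k: int) -> Tuple[List[Point], float]:
    """Incremental (global) k-means: add one center per outer iteration."""
    # The 1-cluster solution is the centroid of the dataset.
    centers = [mean(data)]
    error = sum(sq_dist(p, centers[0]) for p in data)
    for _ in range(2, k + 1):
        best = None
        # Try every data point as the starting position of the new center,
        # keeping the centers found so far as the other starting points.
        for candidate in data:
            trial = kmeans(data, centers + [candidate])
            if best is None or trial[1] < best[1]:
                best = trial
        centers, error = best
    return centers, error
```

Calling `global_kmeans(data, k)` on a list of coordinate tuples returns the `k` centers and the corresponding error. The inner loop runs a full k-means from every data point at every step, which illustrates why this family of methods becomes expensive on larger datasets and why reducing that work is worthwhile.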