We address the clustering problem in the context of exploratory data analysis, where data sets are investigated under different, and desirably contrasting, perspectives. In this scenario, where solutions are evaluated by criterion functions for the sake of flexibility, we introduce and evaluate a generalized, efficient version of the incremental one-by-one clustering algorithm of MacQueen (1967). Unlike the widely adopted two-phase algorithm of Lloyd (1957), our approach does not rely on the gradient of the criterion function being optimized, and can therefore handle non-convex criteria. In an extensive experimental analysis on real-world data sets with a more flexible, non-convex criterion function, our algorithm produced results considerably better than those obtained with the k-means criterion, making it a valuable tool for exploratory clustering applications.
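The core idea described above — moving points between clusters one at a time and accepting any move that improves an arbitrary, black-box criterion function, rather than following the gradient of a fixed objective as Lloyd-style k-means does — can be illustrated with a minimal sketch. The function names, the first-improvement hill-climbing move rule, and the within-cluster sum-of-squares example criterion below are illustrative assumptions, not the paper's actual algorithm or evaluation criterion:

```python
import random

def within_cluster_ss(clusters, points):
    """Example criterion (the k-means objective): total squared
    distance of each point to its cluster centroid. Any function with
    this signature could be plugged in, convex or not."""
    total = 0.0
    for members in clusters:
        if not members:
            continue
        dim = len(points[members[0]])
        centroid = [sum(points[i][d] for i in members) / len(members)
                    for d in range(dim)]
        total += sum((points[i][d] - centroid[d]) ** 2
                     for i in members for d in range(dim))
    return total

def incremental_cluster(points, k, criterion, seed=0, max_passes=50):
    """Hypothetical sketch of one-by-one incremental clustering in the
    spirit of MacQueen (1967): repeatedly try moving a single point to
    another cluster and keep the move whenever the criterion improves.
    No gradient of the criterion is needed."""
    rng = random.Random(seed)
    assign = [rng.randrange(k) for _ in points]          # random start
    clusters = [[i for i, a in enumerate(assign) if a == c]
                for c in range(k)]
    best = criterion(clusters, points)
    for _ in range(max_passes):
        improved = False
        for i in range(len(points)):
            src = assign[i]
            if len(clusters[src]) <= 1:                  # keep clusters non-empty
                continue
            for dst in range(k):
                if dst == src:
                    continue
                clusters[src].remove(i)                  # tentative move
                clusters[dst].append(i)
                val = criterion(clusters, points)
                if val < best - 1e-12:                   # accept improving move
                    best, assign[i], improved = val, dst, True
                    break
                clusters[dst].remove(i)                  # undo the move
                clusters[src].append(i)
        if not improved:                                 # local optimum reached
            break
    return assign, best
```

Because each candidate move is evaluated by calling the criterion directly, swapping in a non-convex criterion requires no change to the search loop, which is the flexibility the abstract highlights.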