Since the advent of data clustering, the original formulation of the clustering problem has been enriched with a number of variations that widen its range of application. In particular, recent heuristic approaches incorporate restrictions on the size of the clusters while striving to minimize a measure of dissimilarity within them. Such size constraints are a way to exploit prior knowledge, readily available in many scenarios, which can lead to improved clustering performance. In this paper, we build upon a modification of the celebrated k-means method that resorts to a similar alternating optimization procedure. The modification is endowed with additive partition weights that control the size of the partitions formed, and these weights are adjusted by means of the Levenberg-Marquardt algorithm. We propose several further variations on this modification, in which different kinds of additional information are present. We report experimental results on various standardized datasets, demonstrating that our approaches outperform existing heuristics for size-constrained clustering. The running-time complexity of our proposal is assessed experimentally by means of a power-law regression analysis.
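To make the idea of additive partition weights concrete, the following is a minimal sketch, not the method of the paper. It assumes that each point is assigned to the cluster minimizing its squared Euclidean distance to the centroid plus that cluster's weight, so that raising a weight shrinks the corresponding cluster. The weight adjustment shown here is a plain proportional correction on the size excess, swapped in for the Levenberg-Marquardt step mentioned above, and it is not guaranteed to meet the target sizes exactly. All names (`weighted_assign`, `size_constrained_kmeans`, `target_sizes`, `step`) are hypothetical.

```python
import numpy as np

def weighted_assign(X, centroids, weights):
    """Assign each point to the cluster minimizing squared distance plus an additive weight."""
    # d2[i, j] is the squared Euclidean distance from point i to centroid j.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return np.argmin(d2 + weights[None, :], axis=1)

def size_constrained_kmeans(X, k, target_sizes, n_iter=20, weight_steps=100,
                            step=0.5, seed=0):
    """Alternate between weight adjustment, weighted assignment, and centroid update.

    Assumption: the weight update is a simple proportional heuristic,
    used here in place of a Levenberg-Marquardt adjustment.
    """
    X = np.asarray(X, dtype=float)
    target = np.asarray(target_sizes, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    weights = np.zeros(k)
    # Heuristic scale so that one excess point changes a weight by a small amount.
    scale = X.var(axis=0).sum() / len(X)
    for _ in range(n_iter):
        # Adjust weights: penalize oversized clusters, favor undersized ones.
        for _ in range(weight_steps):
            labels = weighted_assign(X, centroids, weights)
            excess = np.bincount(labels, minlength=k) - target
            if np.all(excess == 0):
                break
            weights += step * scale * excess
            weights -= weights.mean()  # only weight differences affect the assignment
        # Centroid update, as in ordinary k-means; empty clusters keep their centroid.
        labels = weighted_assign(X, centroids, weights)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels, centroids, weights
```

With all weights equal to zero, `weighted_assign` reduces to the usual nearest-centroid rule of k-means, so the weights are the only mechanism by which the size constraints enter the assignment step in this sketch.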