This paper presents a Dynamic Clustering Algorithm for histogram data with an automatic variable-weighting step based on adaptive distances. The Dynamic Clustering Algorithm is a k-means-like algorithm for partitioning a set of objects into a predefined number of classes. Histogram data are realizations of particular set-valued descriptors defined in the context of Symbolic Data Analysis. We propose to use the ℓ2 Wasserstein distance for clustering histogram data, together with two novel adaptive-distance-based clustering schemes. The ℓ2 Wasserstein distance allows the variability of a set of histograms to be decomposed into two components: the first related to the variability of their averages, and the second to the differences in size and shape among the histograms. The weighting step takes into account global and local adaptive distances, as well as the two components of the variability of a set of histograms. To evaluate the clustering results, we extend some classic partition quality indexes to the case where the proposed adaptive distances are used in the clustering criterion function. Examples on synthetic and real-world datasets corroborate the proposed clustering procedure.
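As a minimal illustration of the distance the abstract refers to, the sketch below estimates the squared ℓ2 Wasserstein distance between two one-dimensional empirical distributions as the integral of the squared difference of their quantile functions, and splits it into a location term (difference of the means) plus a residual size/shape term, matching the two-component decomposition described above. The function name, the quantile-grid discretization, and the sample data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def l2_wasserstein_sq(x, y, n_grid=1000):
    """Approximate squared L2 Wasserstein distance between two
    empirical distributions: the integral over t in (0, 1) of
    (Qx(t) - Qy(t))^2, with the quantile functions Qx, Qy
    evaluated on a uniform midpoint grid (a simple assumption)."""
    t = (np.arange(n_grid) + 0.5) / n_grid  # midpoints of n_grid cells on (0, 1)
    qx = np.quantile(np.asarray(x, dtype=float), t)
    qy = np.quantile(np.asarray(y, dtype=float), t)
    return float(np.mean((qx - qy) ** 2))

# Two-component split: a location part driven by the difference of the
# averages, and a residual part driven by differences in size and shape.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 500)
y = rng.normal(3.0, 2.0, 500)

d2 = l2_wasserstein_sq(x, y)
location = (x.mean() - y.mean()) ** 2  # variability of the averages
shape = d2 - location                  # size/shape component
```

If the two distributions differ only by a shift, the shape component is (up to discretization error) zero and the whole distance is carried by the location term.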