Dynamic clustering of histogram data based on adaptive squared Wasserstein distances

Authors:
Antonio Irpino;Rosanna Verde;Francisco De A. T. De Carvalho
Venue:
Expert Systems with Applications: An International Journal
Year:
2014

Abstract

This paper presents a Dynamic Clustering Algorithm for histogram data that includes an automatic variable-weighting step based on adaptive distances. The Dynamic Clustering Algorithm is a k-means-like algorithm that partitions a set of objects into a predefined number of classes. Histogram data are realizations of particular set-valued descriptors defined in the context of Symbolic Data Analysis. We propose using the ℓ2 Wasserstein distance to cluster histogram data, together with two novel clustering schemes based on adaptive distances. The ℓ2 Wasserstein distance allows the variability of a set of histograms to be expressed as two components: the first related to the variability of their averages, and the second related to differences in the size and shape of the histograms. The weighting step takes into account both global and local adaptive distances, as well as these two components of the variability of a set of histograms. To evaluate the clustering results, we extend some classic partition quality indexes to the case where the proposed adaptive distances are used in the clustering criterion function. Examples on synthetic and real-world datasets corroborate the proposed clustering procedure.
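
The two-component decomposition mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each histogram is given as a pair (bin edges, bin weights) with uniform density inside each bin, and it approximates the integral over quantile levels with a midpoint grid. The function names `histogram_quantile` and `wasserstein2_decomposition` and the parameter `n_grid` are hypothetical, introduced only for this example.

```python
from bisect import bisect_left
from itertools import accumulate


def histogram_quantile(edges, weights, t):
    """Quantile at level t (0 < t < 1) of a histogram, assuming uniform
    density inside each bin (an assumption of this sketch)."""
    total = sum(weights)
    cum = [0.0] + [c / total for c in accumulate(weights)]
    cum[-1] = 1.0  # guard against floating-point drift in the last value
    # Bin i satisfies cum[i] < t <= cum[i + 1], so empty bins are skipped.
    i = min(max(bisect_left(cum, t) - 1, 0), len(weights) - 1)
    frac = (t - cum[i]) / (cum[i + 1] - cum[i])
    return edges[i] + frac * (edges[i + 1] - edges[i])


def wasserstein2_decomposition(h1, h2, n_grid=1000):
    """Approximate the squared L2 Wasserstein distance between two
    histograms and split it into a mean part and a size/shape part.

    Returns (total, mean_part, shape_part), with total equal to
    mean_part + shape_part up to rounding.
    """
    # Midpoint grid of quantile levels in (0, 1).
    ts = [(k + 0.5) / n_grid for k in range(n_grid)]
    q1 = [histogram_quantile(h1[0], h1[1], t) for t in ts]
    q2 = [histogram_quantile(h2[0], h2[1], t) for t in ts]
    m1 = sum(q1) / n_grid
    m2 = sum(q2) / n_grid
    # Squared distance between the two quantile functions.
    total = sum((a - b) ** 2 for a, b in zip(q1, q2)) / n_grid
    # Component due to the difference between the averages.
    mean_part = (m1 - m2) ** 2
    # Component due to the centered quantile functions (size and shape).
    shape_part = sum(((a - m1) - (b - m2)) ** 2 for a, b in zip(q1, q2)) / n_grid
    return total, mean_part, shape_part


# Hypothetical example: uniform on [0, 2] versus uniform on [0, 4].
narrow = ([0.0, 1.0, 2.0], [1.0, 1.0])
wide = ([0.0, 4.0], [1.0])
total, mean_part, shape_part = wasserstein2_decomposition(narrow, wide)
print(total, mean_part, shape_part)
```

The split works because the centered quantile functions average to zero, so the cross term between the mean difference and the centered difference vanishes. Two histograms that differ only by a shift therefore have a size/shape component of zero.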