Clustering methods often reduce to optimizing a numeric criterion defined from a distance or a dissimilarity measure. This problem can often be shown to be equivalent to estimating the parameters of a probabilistic model under the classification likelihood approach. For instance, the inertia criterion optimized by the k-means algorithm corresponds to the hypothesis that the population arises from a Gaussian mixture. In this paper, we propose a mixture model adapted to categorical data. Using the classification likelihood approach, we develop the Classification EM (CEM) algorithm to estimate the parameters of this mixture model. With our probabilistic model, the data are not altered, and the estimated parameters readily indicate the characteristics of the clusters. This probabilistic approach gives an interpretation of the criterion optimized by the k-modes algorithm, an extension of k-means to categorical attributes, and allows us to study the behavior of this algorithm.
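To make the k-modes criterion mentioned in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes the simple matching dissimilarity (the number of attributes on which two objects differ) and alternates a classification step with a mode-update step, which mirrors the C and M steps of a CEM-style iteration. The function names `matching_dissimilarity` and `k_modes`, the choice of the first k distinct rows as initial modes, and the tie-breaking rules are all illustrative assumptions.

```python
from collections import Counter


def matching_dissimilarity(x, mode):
    """Count the attributes on which the object x and the mode differ."""
    return sum(a != b for a, b in zip(x, mode))


def k_modes(rows, k, max_iter=100):
    """Cluster tuples of categorical values; returns (labels, modes, criterion)."""
    # Assumed initialization: the first k distinct rows serve as initial modes.
    modes = []
    for row in rows:
        if row not in modes:
            modes.append(row)
        if len(modes) == k:
            break
    if len(modes) < k:
        raise ValueError("fewer than k distinct rows")

    labels = None
    for _ in range(max_iter):
        # Classification step: assign each row to its closest mode
        # (ties go to the lowest cluster index).
        new_labels = [
            min(range(k), key=lambda j: matching_dissimilarity(row, modes[j]))
            for row in rows
        ]
        if new_labels == labels:
            break  # the partition is stable, so the modes will not change
        labels = new_labels
        # Update step: each mode takes the most frequent category per attribute.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mode
                modes[j] = tuple(
                    Counter(column).most_common(1)[0][0]
                    for column in zip(*members)
                )

    # Criterion: total number of mismatches between rows and their cluster modes.
    criterion = sum(
        matching_dissimilarity(row, modes[lab]) for row, lab in zip(rows, labels)
    )
    return labels, modes, criterion
```

Each row is expected to be a tuple of category labels, for example `("a", "x", "p")`. The returned criterion is the quantity that this sketch decreases at each step, and the modes themselves describe the clusters directly, which is in the spirit of the interpretability claim made above.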