Clustering mixed data

Authors:
Lynette Hunt; Murray Jorgensen
Affiliations:
Department of Statistics, University of Waikato, Hamilton, New Zealand (both authors)
Venue:
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Year:
2011

Abstract

Mixture model clustering fits a finite mixture of multivariate distributions to the data; the fitted mixture density is then used to allocate each observation to one of the components. Common model formulations assume either that all attributes are continuous or that all attributes are categorical. In this paper, we consider options for model formulation in the more practical case of mixed data: multivariate data sets that contain both continuous and categorical attributes.

© 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011, 1:352–361. DOI: 10.1002/widm.33
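
To make the fit-then-allocate idea in the abstract concrete, the following is a minimal sketch of one possible formulation for mixed data. It is not taken from the paper. It assumes a single continuous attribute modelled as Gaussian and a single categorical attribute modelled as multinomial, treated as independent within each component (a local independence assumption), fitted by EM. The function name `fit_mixed_mixture`, the quantile-based initialisation, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_mixed_mixture(x_cont, x_cat, n_components, n_iter=50):
    """EM for a mixture with one Gaussian and one categorical attribute.

    Assumption (illustrative only): within each component the two
    attributes are independent. x_cat holds integer codes 0..L-1.
    """
    n = len(x_cont)
    n_levels = int(x_cat.max()) + 1
    onehot = np.eye(n_levels)[x_cat]                  # shape (n, L)

    # Hypothetical initialisation: split observations into equal-sized
    # groups by the rank of the continuous attribute.
    ranks = np.argsort(np.argsort(x_cont))
    resp = np.eye(n_components)[ranks * n_components // n]

    for _ in range(n_iter):
        # M-step: weighted estimates of each component's parameters.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = resp.T @ x_cont / nk
        sq_dev = (x_cont[:, None] - means) ** 2       # shape (n, K)
        variances = (resp * sq_dev).sum(axis=0) / nk + 1e-6
        cat_probs = resp.T @ onehot + 1e-6            # shape (K, L)
        cat_probs /= cat_probs.sum(axis=1, keepdims=True)

        # E-step: posterior probability of each component per observation.
        log_gauss = -0.5 * (np.log(2 * np.pi * variances) + sq_dev / variances)
        log_cat = np.log(cat_probs[:, x_cat]).T       # shape (n, K)
        log_joint = np.log(weights) + log_gauss + log_cat
        log_joint -= log_joint.max(axis=1, keepdims=True)
        resp = np.exp(log_joint)
        resp /= resp.sum(axis=1, keepdims=True)

    return resp, weights, means, variances, cat_probs

# Synthetic mixed data: two well-separated groups (made up for this sketch).
rng = np.random.default_rng(0)
x_cont = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(10.0, 1.0, 100)])
x_cat = np.concatenate([
    rng.choice(3, size=100, p=[0.8, 0.1, 0.1]),
    rng.choice(3, size=100, p=[0.1, 0.1, 0.8]),
])

resp, weights, means, variances, cat_probs = fit_mixed_mixture(x_cont, x_cat, 2)

# Allocation step: assign each observation to its most probable component.
labels = resp.argmax(axis=1)
```

The final `argmax` is the allocation step described in the abstract: each observation goes to the component with the highest posterior probability under the fitted mixture. Local independence is only the simplest way to combine the two attribute types; formulations that allow dependence between continuous and categorical attributes within a component would need a different component density.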