Automatic aspect discrimination in data clustering

  • Authors:
  • Danilo Horta; Ricardo J. G. B. Campello

  • Affiliations:
  • Instituto de Ciências Matemáticas e de Computação - ICMC, Universidade de São Paulo - Campus de São Carlos, Caixa Postal 668, 13560-970 São Carlos-SP, Brazil (both authors)

  • Venue:
  • Pattern Recognition
  • Year:
  • 2012

Abstract

The attributes describing a data set may often be arranged in meaningful subsets, each of which corresponds to a different aspect of the data. An unsupervised algorithm (SCAD) that simultaneously performs fuzzy clustering and aspect weighting was proposed in the literature. However, SCAD may fail and halt under certain conditions. To fix this problem, its steps are modified and then reordered so as to reduce the number of parameters that the user must set. In this paper we prove that each step of the resulting algorithm, named ASCAD, globally minimizes its cost function with respect to the argument being optimized. The asymptotic analysis of ASCAD yields a time complexity equal to that of fuzzy c-means. A hard version of the algorithm and a novel validity criterion that takes aspect weights into account when estimating the number of clusters are also described. The proposed method is assessed on several artificial and real data sets.
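For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of standard fuzzy c-means, the algorithm whose time complexity ASCAD is said to match. It is not SCAD or ASCAD: it performs no aspect weighting. The function name `fuzzy_c_means`, the toy data, and the initial centers are assumptions made purely for illustration. Each iteration alternates a membership update and a center update, and in this sketch costs on the order of n·c·d operations for n points, c clusters, and d attributes.

```python
def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """Plain fuzzy c-means (illustrative only; no aspect weighting)."""
    centers = [list(c) for c in centers]
    dim = len(points[0])
    memberships = []
    for _ in range(n_iter):
        # Membership update: u[i][k] is proportional to d(i, k)^(-2 / (m - 1)).
        memberships = []
        for p in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            if any(d == 0.0 for d in d2):
                # The point coincides with a center: share membership
                # equally among the zero-distance centers.
                zeros = [1.0 if d == 0.0 else 0.0 for d in d2]
                total = sum(zeros)
                memberships.append([z / total for z in zeros])
            else:
                # d2 holds squared distances, so the exponent is -1 / (m - 1).
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
                total = sum(inv)
                memberships.append([v / total for v in inv])
        # Center update: mean of the points weighted by u^m.
        for k in range(len(centers)):
            w = [u[k] ** m for u in memberships]
            sw = sum(w)
            if sw > 0.0:  # leave the center unchanged if it attracts no weight
                centers[k] = [
                    sum(wi * p[j] for wi, p in zip(w, points)) / sw
                    for j in range(dim)
                ]
    return centers, memberships


# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, memberships = fuzzy_c_means(points, [(1.0, 1.0), (9.0, 9.0)])
```

Each row of `memberships` sums to one, and on this toy data the two centers settle near the means of the two groups. The returned memberships are those computed just before the final center update.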