Using the Negentropy Increment to Determine the Number of Clusters

  • Authors:
  • Luis F. Lago-Fernández;Fernando Corbacho

  • Affiliations:
  • Departamento de Ingeniería Informática, Escuela Politécnica Superior, Universidad Autónoma de Madrid, Madrid, Spain 28049;Cognodata Consulting, Madrid, Spain 28010

  • Venue:
  • IWANN '09 Proceedings of the 10th International Work-Conference on Artificial Neural Networks: Part I: Bio-Inspired Systems: Computational and Ambient Intelligence
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

We introduce a new validity index for crisp clustering that is based on the average normality of the clusters. A normal cluster is optimal in the sense of maximum uncertainty, or minimum structure, and so performing further partitions on it will not reveal additional substructures. To characterize the normality of a cluster we use the negentropy, a standard measure of distance to normality which evaluates the difference between the cluster's entropy and the entropy of a normal distribution with the same covariance matrix. Although the definition of the negentropy involves the differential entropy, we show that it is possible to avoid its explicit computation by considering only negentropy increments with respect to the initial data distribution. The resulting negentropy increment validity index only requires the computation of determinants of covariance matrices. We have applied the index to randomly generated problems, and show that it provides better results than other indices for the assessment of the number of clusters.