A comparison of internal and external cluster validation indexes

  • Authors:
  • Eréndira Rendón; Itzel M. Abundez; Citlalih Gutierrez; Sergio Díaz Zagal; Alejandra Arizmendi; Elvia M. Quiroz; H. Elsa Arzate

  • Affiliations:
  • División de Estudios de Postgrado e Investigación, Instituto Tecnológico de Toluca, Edo. de México, México; División de Estudios de Postgrado e Investigación, Instituto Tecnológico de Toluca, Edo. de México, México; Instituto Tecnológico de Toluca, Edo. de México, México; División de Estudios de Postgrado e Investigación, Instituto Tecnológico de Toluca, Edo. de México, México; División de Estudios de Postgrado e Investigación, Instituto Tecnológico de Toluca, Edo. de México, México; Instituto Tecnológico de Toluca, Edo. de México, México; Instituto Tecnológico de Toluca, Edo. de México, México

  • Venue:
  • AMERICAN-MATH'11/CEA'11 Proceedings of the 2011 American conference on applied mathematics and the 5th WSEAS international conference on Computer engineering and applications
  • Year:
  • 2011

Abstract

The procedure of evaluating the results of a clustering algorithm is known as cluster validity. In general terms, cluster validity criteria can be classified into three categories: internal, external and relative. In this work we focus on the external and internal criteria. External indexes require a priori data to evaluate the results of a clustering algorithm, whereas internal indexes do not. Consequently, different types of indexes are used to solve different types of problems, and the choice of index depends on the kind of data available. It is interesting to note that, regardless of the type of information or the algorithm used, they provided the highest degree of accuracy in determining the groups. That is why in this paper we present a comparison between external and internal indexes. The results obtained in this study indicate that internal indexes are more accurate at determining the groups in a given clustering structure. Five internal indexes were used in this study: BIC (Bayesian Information Criterion), CH (Calinski-Harabasz), DB (Davies-Bouldin), SIL (Silhouette) and DUNN (Dunn). The groups used were obtained with the K-means and Bisecting K-means clustering algorithms.
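
The distinction between the two kinds of index can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the Rand index as an example of an external index (the abstract does not name the external indexes used) and uses the Dunn index, which the abstract does list, as the internal one. The toy points, the label lists and the function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import dist, inf


def rand_index(labels_true, labels_pred):
    """External index (assumed example): fraction of point pairs on which
    the reference labels and the clustering agree (same group vs. not)."""
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def dunn_index(points, labels):
    """Internal index: smallest distance between points of different clusters
    divided by the largest cluster diameter. Needs no reference labels."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    groups = list(clusters.values())
    max_diameter = max(
        (dist(a, b) for g in groups for a, b in combinations(g, 2)),
        default=0.0,
    )
    min_separation = min(
        dist(a, b)
        for g1, g2 in combinations(groups, 2)
        for a in g1
        for b in g2
    )
    # All-singleton clusterings have zero diameter; report infinity then.
    return inf if max_diameter == 0 else min_separation / max_diameter


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
truth = [0, 0, 0, 1, 1, 1]      # a priori labels, needed only by the external index
good = [1, 1, 1, 0, 0, 0]       # same grouping as truth, different label names
bad = [0, 0, 1, 1, 1, 1]        # one point assigned to the wrong group

for name, labels in (("good", good), ("bad", bad)):
    print(name, rand_index(truth, labels), dunn_index(points, labels))
```

Both indexes rank the `good` partition above the `bad` one, but only `rand_index` needs the `truth` labels. `dunn_index` works from the points and the partition alone, which is what makes internal indexes usable when no a priori data is available.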