An automatic method to determine the number of clusters using decision-theoretic rough set

  • Authors:
  • Hong Yu;Zhanguo Liu;Guoyin Wang

  • Venue:
  • International Journal of Approximate Reasoning
  • Year:
  • 2014

Abstract

Clustering provides a common means of identifying structure in complex data, and there is renewed interest in clustering as a tool for the analysis of large data sets in many fields. Determining the number of clusters in a data set is one of the most challenging problems in cluster analysis. To address this problem, this paper proposes an efficient automatic method that extends the decision-theoretic rough set model to clustering. A new clustering validity evaluation function is designed based on the risk calculated from loss functions and possibilities. A hierarchical clustering algorithm, ACA-DTRS, is then proposed and is proved to stop automatically at the perfect number of clusters without manual intervention. Furthermore, a novel fast algorithm, FACA-DTRS, is devised based on the conclusion obtained in the validation of the ACA-DTRS algorithm. The performance of the algorithms has been studied on several synthetic and real-world data sets. The algorithm analysis and the results of the comparison experiments show that the new method, which requires no manually specified parameters, determines the number of clusters more reliably and at a lower time cost.
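
The general idea of a hierarchical procedure that selects the number of clusters through a validity function can be illustrated with a minimal sketch. The abstract does not give the loss functions or the risk formula, so the code below is not ACA-DTRS: the function `stand_in_risk` is a hypothetical substitute (one minus a simplified, centroid-based silhouette), the merge rule (closest centroids) is an assumption, and instead of a proven stopping criterion the sketch simply scans every level of the hierarchy and keeps the partition with the lowest stand-in risk. All names are illustrative.

```python
from itertools import combinations
from math import dist


def centroid(cluster):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))


def stand_in_risk(clusters):
    """Hypothetical risk: 1 - mean simplified silhouette.

    This is a placeholder, NOT the paper's DTRS-based evaluation function.
    """
    centers = [centroid(c) for c in clusters]
    scores = []
    for i, cluster in enumerate(clusters):
        for p in cluster:
            if len(cluster) == 1:
                scores.append(0.0)  # usual silhouette convention for singletons
                continue
            a = dist(p, centers[i])  # distance to own centroid
            b = min(dist(p, centers[j])  # distance to nearest other centroid
                    for j in range(len(clusters)) if j != i)
            denom = max(a, b)
            scores.append((b - a) / denom if denom else 0.0)
    return 1.0 - sum(scores) / len(scores)


def auto_cluster(points):
    """Agglomerative merging; returns the partition with the lowest stand-in risk.

    Only partitions with at least two clusters are evaluated, so the input
    needs at least two points (otherwise None is returned).
    """
    clusters = [[p] for p in points]
    best_risk, best = float("inf"), None
    while len(clusters) >= 2:
        risk = stand_in_risk(clusters)
        if risk < best_risk:
            best_risk, best = risk, [list(c) for c in clusters]
        # Assumed merge rule: join the two clusters whose centroids are closest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(centroid(clusters[ij[0]]),
                                centroid(clusters[ij[1]])),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe: combinations guarantees i < j
    return best


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11)]
    print(len(auto_cluster(pts)))
```

On two well-separated groups of points, as in the usage example, the lowest stand-in risk occurs at the level where each group forms one cluster. No number of clusters is passed in, which is the property the abstract emphasizes, although the criterion used here is only a generic substitute.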