Dempster-Shafer evidence theory based multi-characteristics fusion for clustering evaluation

  • Authors:
  • Shihong Yue;Teresa Wu;Yamin Wang;Kai Zhang;Weixia Liu

  • Affiliations:
  • School of Electric Engineering and Automation, Tianjin University, Tianjin, China;School of Computing, Informatics, Decision Systems Engineering, Arizona State University, Tempe, AZ;School of Electric Engineering and Automation, Tianjin University, Tianjin, China;School of Electric Engineering and Automation, Tianjin University, Tianjin, China;School of Electric Engineering and Automation, Tianjin University, Tianjin, China

  • Venue:
  • RSKT'10 Proceedings of the 5th international conference on Rough set and knowledge technology
  • Year:
  • 2010

Abstract

Clustering is a widely used unsupervised learning method for grouping data with similar characteristics. The performance of a clustering method can generally be evaluated through validity indices. However, most validity indices are designed for specific algorithms and specific structures of the data space. Moreover, these indices consist of a few within-cluster and between-cluster distance functions, and their applicability relies heavily on combining these functions correctly. In this research, we first summarize three common characteristics of any clustering evaluation: (1) the clustering outcome can be evaluated by a group of validity indices if some efficient validity indices are available, (2) the clustering outcome can be measured by an independent intra-cluster distance function, and (3) the clustering outcome can be measured by neighborhood-based functions. Considering the complementary and unstable nature of these clustering evaluations, we then apply Dempster-Shafer (D-S) evidence theory to fuse the three characteristics into a new index, termed fused Multiple Characteristic Indices (fMCI). The fMCI is generally capable of evaluating the clustering outcomes of arbitrary clustering methods on data spaces with more complex structures. We conduct a number of experiments to demonstrate that the fMCI is applicable to different clustering algorithms on different datasets and that it achieves more accurate and robust clustering evaluation than existing indices.
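
The abstract does not spell out how the three characteristics are turned into evidence, so the following is only a minimal sketch of the general fusion step, Dempster's rule of combination, and not the authors' fMCI construction. It assumes a hypothetical frame of discernment made of candidate cluster counts {2, 3, 4}, and the three mass functions (`m_indices`, `m_intra`, `m_neighbor`) carry made-up illustrative values standing in for the validity-index, intra-cluster, and neighborhood-based evidence.

```python
from functools import reduce
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset (focal element) to its mass;
    the masses of one function are expected to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the common subset.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass K.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    norm = 1.0 - conflict
    return {focal: mass / norm for focal, mass in combined.items()}


# Hypothetical frame: candidate numbers of clusters.
THETA = frozenset({2, 3, 4})

# Illustrative (assumed) masses for the three kinds of evidence.
m_indices = {frozenset({3}): 0.6, THETA: 0.4}
m_intra = {frozenset({3}): 0.5, frozenset({4}): 0.3, THETA: 0.2}
m_neighbor = {frozenset({2}): 0.2, frozenset({3}): 0.5, THETA: 0.3}

# Fuse all three sources; Dempster's rule is associative and commutative.
fused = reduce(combine, [m_indices, m_intra, m_neighbor])
best = max(fused, key=fused.get)

if __name__ == "__main__":
    for focal, mass in sorted(fused.items(), key=lambda kv: -kv[1]):
        print(sorted(focal), round(mass, 4))
```

Mass left on the whole frame (`THETA`) expresses a source's ignorance, which is one way an unstable or unreliable evaluation could be discounted before fusion. How the paper actually derives masses from each characteristic would need to be taken from the full text.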