Comparing partitions by subset similarities

  • Authors:
  • Thomas A. Runkler

  • Affiliations:
  • Siemens Corporate Technology, Muenchen, Germany

  • Venue:
  • IPMU'10 Proceedings of the Computational intelligence for knowledge-based systems design, and 13th international conference on Information processing and management of uncertainty
  • Year:
  • 2010

Abstract

Comparing partitions is an important issue in classification and clustering when comparing results from different methods, parameters, or initializations. A well-established method for comparing partitions is the Rand index, but this index is suitable for crisp partitions only. Recently, the Hüllermeier-Rifqi index was introduced, which is a generalization of the Rand index to fuzzy partitions. In this paper we introduce a new approach to comparing partitions based on the similarities of their clusters in the sense of set similarity. All three indices, Rand, Hüllermeier-Rifqi, and subset similarity, are reflexive, invariant against row permutations, and invariant against additional empty subsets. The subset similarity index is not a generalization of the Rand index, but produces similar values. Subset similarity yields more intuitive similarities than Hüllermeier-Rifqi when comparing crisp and fuzzy partitions, and yields smoother nonlinear transitions. Finally, the subset similarity index has a lower computational complexity than the Hüllermeier-Rifqi index for large numbers of objects.
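
For readers unfamiliar with the baseline, the following is a minimal sketch of the standard Rand index for crisp partitions only. It is not the subset similarity index or the Hüllermeier-Rifqi index, neither of which is defined in the abstract. The function name `rand_index` and the label-list representation (one cluster label per object) are assumptions made for illustration; the paper itself speaks of partition matrices, where relabeling clusters corresponds to permuting rows.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Rand index of two crisp partitions given as per-object cluster labels.

    Counts the fraction of object pairs on which both partitions agree:
    either both put the pair in the same cluster, or both separate it.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must cover the same objects")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        # Fewer than two objects: treated as identical (a convention, assumed here).
        return 1.0
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Illustrative data (hypothetical, not from the paper).
p = [0, 0, 1, 1, 2]
q = [0, 0, 1, 2, 2]
relabelled = [2, 2, 0, 0, 1]  # same clusters as p, labels permuted

print(rand_index(p, p))           # reflexivity: 1.0
print(rand_index(p, relabelled))  # invariance to cluster relabeling: 1.0
print(rand_index(p, q))           # 8 of 10 pairs agree: 0.8
```

In this label-based representation an empty cluster cannot be expressed at all, so invariance against additional empty subsets holds trivially. The pairwise loop costs O(n²) in the number of objects, which is the kind of cost the abstract's complexity remark refers to for pair-based indices.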