A Comparison of the Stability Characteristics of Some Graph Theoretic Clustering Methods

  • Authors: Vijay V. Raghavan; C. T. Yu

  • Affiliations: Department of Computer Science, University of Regina, Regina, Sask., Canada; Department of Information Engineering, University of Illinois at Chicago Circle, Chicago, IL 60638.

  • Venue: IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Year: 1981

Abstract

Assessing the stability of a clustering method involves the measurement of the extent to which the generated clusters are affected by perturbations in the input data. A measure which specifies the disturbance in a set of clusters as the minimum number of operations required to restore the set of modified clusters to the original ones is adopted. A number of well-known graph theoretic clustering methods are compared in terms of their stability as determined by this measure. Specifically, it is shown that among the clustering methods in any of several families of graph theoretic methods, clusters defined as the connected components are the most stable and the clusters specified as the maximal complete subgraphs are the least stable. Furthermore, as one proceeds from the method producing the most narrow clusters (maximal complete subgraphs) to those producing relatively broader clusters, the clustering process is shown to remain at least as stable as any method in the previous stages. Finally, the lower and the upper bounds for the measure of stability, when clusters are defined as the connected components, are derived.
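
The contrast between the two extreme cluster definitions named in the abstract can be illustrated with a small sketch. It builds connected components and maximal complete subgraphs (maximal cliques) for a toy graph, deletes one edge as a perturbation, and compares the cluster sets before and after. The function names, the toy graph, and `changed_clusters` are illustrative assumptions. In particular, `changed_clusters` is only a crude proxy (the number of clusters present in one clustering but not the other); it is not the minimum-operations measure adopted in the paper.

```python
def _adjacency(nodes, edges):
    """Build an undirected adjacency map from a node list and an edge list."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def connected_components(nodes, edges):
    """Clusters defined as connected components (the broadest definition)."""
    adj = _adjacency(nodes, edges)
    seen, clusters = set(), set()
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # iterative depth-first search
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v] - comp)
        seen |= comp
        clusters.add(frozenset(comp))
    return clusters


def maximal_cliques(nodes, edges):
    """Clusters defined as maximal complete subgraphs (the narrowest
    definition), found with the basic Bron-Kerbosch recursion."""
    adj = _adjacency(nodes, edges)
    cliques = set()

    def expand(r, p, x):
        if not p and not x:
            cliques.add(frozenset(r))  # r cannot be extended: it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(nodes), set())
    return cliques


def changed_clusters(before, after):
    """Hypothetical proxy for disturbance: clusters found in exactly one of
    the two clusterings. NOT the measure defined in the paper."""
    return len(before ^ after)


# Toy graph (an assumption): two triangles sharing node 3.
nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]
perturbed = [e for e in edges if e != (1, 2)]  # perturbation: drop one edge

print(changed_clusters(connected_components(nodes, edges),
                       connected_components(nodes, perturbed)))  # 0
print(changed_clusters(maximal_cliques(nodes, edges),
                       maximal_cliques(nodes, perturbed)))       # 3
```

In this toy case the single connected component survives the deleted edge, while the clique {1, 2, 3} splits into {1, 3} and {2, 3}. This matches the direction of the result stated in the abstract, though one example on one graph is no evidence for the general claim.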