Supervised Clustering " Algorithms and Benefits

  • Authors:
  • Christoph F. Eick;Nidal Zeidat;Zhenghong Zhao

  • Affiliations:
  • University of Houston;University of Houston;University of Houston

  • Venue:
  • ICTAI '04 Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper centers on a novel data mining technique we term supervised clustering. Unlike traditional clustering, supervised clustering assumes that the examples are classified and has the goal of identifying class-uniform clusters that have high probability densities. Four representative-based algorithms for supervised clustering are introduced: a greedy algorithm with random restart, named SRIDHCR, that seeks for solutions by inserting and removing single objects from the current solution, SPAM (a variation of the clustering algorithm PAM), an evolutionary computing algorithm named SCEC, and a fast medoid-based top-down splitting algorithm, named TDS. The four algorithms were evaluated using a benchmark consisting of four UCI machine learning data sets. In general, it seems that "greedy" algorithms, such as SPAM, SRIDHCR, and TDS, do not perform particularly well for supervised clustering and seem to terminate prematurely too often. We also briefly describe the applications of supervised clustering.