Value, cost, and sharing: open issues in constrained clustering

  • Authors: Kiri L. Wagstaff
  • Affiliations: Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA
  • Venue: KDID'06 Proceedings of the 5th international conference on Knowledge discovery in inductive databases
  • Year: 2006

Abstract

Clustering is an important tool for data mining, since it can identify major patterns or trends without any supervision (labeled data). Over the past five years, semi-supervised (constrained) clustering methods have become very popular. These methods began with incorporating pairwise constraints and have developed into more general methods that can learn appropriate distance metrics. However, several important open questions have arisen about which constraints are most useful, how they can be actively acquired, and when and how they should be propagated to neighboring points. This position paper describes these open questions and suggests future directions for constrained clustering research.