Integrating constraints and metric learning in semi-supervised clustering

  • Authors:
  • Mikhail Bilenko;Sugato Basu;Raymond J. Mooney

  • Affiliations:
  • University of Texas at Austin, Austin, TX;University of Texas at Austin, Austin, TX;University of Texas at Austin, Austin, TX

  • Venue:
  • ICML '04 Proceedings of the twenty-first international conference on Machine learning
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

Semi-supervised clustering employs a small amount of labeled data to aid unsupervised learning. Previous work in the area has utilized supervised data in one of two approaches: 1) constraint-based methods that guide the clustering algorithm towards a better grouping of the data, and 2) distance-function learning methods that adapt the underlying similarity metric used by the clustering algorithm. This paper provides new methods for the two approaches as well as presents a new semi-supervised clustering algorithm that integrates both of these techniques in a uniform, principled framework. Experimental results demonstrate that the unified approach produces better clusters than both individual approaches as well as previously proposed semi-supervised clustering algorithms.