Learning to cluster using local neighborhood structure

  • Authors:
  • Rómer Rosales; Kannan Achan; Brendan Frey

  • Affiliations:
  • University of Toronto, Toronto, ON, Canada; University of Toronto, Toronto, ON, Canada; University of Toronto, Toronto, ON, Canada

  • Venue:
  • ICML '04 Proceedings of the twenty-first international conference on Machine learning
  • Year:
  • 2004

Abstract

This paper introduces an approach to clustering and classification based on local, high-order structure present in the data. For some problems, this local structure may be more relevant for classification than other measures of point similarity used by popular unsupervised and semi-supervised clustering methods. Under this approach, changes in the class label are associated with changes in the local properties of the data. Using this idea, we also aim to learn how to cluster from examples of clustered data (including examples drawn from different datasets). We formalize these concepts with a probability model that captures their fundamentals, and we show that in this setting learning to cluster is a well-defined and tractable task. Based on probabilistic inference methods, we then present an algorithm for computing the posterior probability distribution over class labels for each data point. Experiments in spatial grouping and functional gene classification illustrate and test these concepts.
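
As a rough illustration of what a "local property" could look like in a spatial grouping setting, the sketch below estimates the dominant orientation of each point's neighborhood and measures how much that orientation changes between two points. It is not the probability model or inference algorithm of the paper. The choice of orientation as the local property, the neighborhood size `k`, and all function names are assumptions made here for illustration only.

```python
import math

def k_nearest(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    xi, yi = points[i]
    order = sorted(
        (j for j in range(len(points)) if j != i),
        key=lambda j: (points[j][0] - xi) ** 2 + (points[j][1] - yi) ** 2,
    )
    return order[:k]

def local_orientation(points, i, k):
    """Principal-axis angle, in [0, pi), of the neighborhood around points[i].

    Hypothetical local property: the direction of largest variance of the
    point together with its k nearest neighbors (2-D data only).
    """
    idx = [i] + k_nearest(points, i, k)
    n = len(idx)
    mx = sum(points[j][0] for j in idx) / n
    my = sum(points[j][1] for j in idx) / n
    sxx = sum((points[j][0] - mx) ** 2 for j in idx) / n
    syy = sum((points[j][1] - my) ** 2 for j in idx) / n
    sxy = sum((points[j][0] - mx) * (points[j][1] - my) for j in idx) / n
    # Closed-form angle of the leading eigenvector of a 2x2 covariance matrix.
    return (0.5 * math.atan2(2.0 * sxy, sxx - syy)) % math.pi

def orientation_change(a, b):
    """Smallest angle between two undirected orientations, in [0, pi/2]."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)

# Toy data (assumed): a horizontal segment and a distant vertical segment.
horizontal = [(float(x), 0.0) for x in range(5)]
vertical = [(10.0, float(y)) for y in range(5)]
points = horizontal + vertical

within = orientation_change(local_orientation(points, 0, 2),
                            local_orientation(points, 1, 2))
across = orientation_change(local_orientation(points, 0, 2),
                            local_orientation(points, 5, 2))
```

In this toy setup, `within` compares two points on the same segment and `across` compares points on different segments. A large change in the local property between two points is the kind of signal that, under the abstract's premise, would accompany a change in class label.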