An efficient algorithm for maximal margin clustering

  • Authors:
  • Jiming Peng — Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign, Urbana, USA
  • Lopamudra Mukherjee — Department of Mathematical and Computer Sciences, University of Wisconsin—Whitewater, Whitewater, USA
  • Vikas Singh — Departments of Biostatistics and Medical Informatics and Computer Sciences, University of Wisconsin—Madison, Madison, USA
  • Dale Schuurmans — Department of Computing Science, University of Alberta, Edmonton, Canada
  • Linli Xu — School of Computer Science and Technology, University of Science and Technology of China, Hefei, China

  • Venue:
  • Journal of Global Optimization
  • Year:
  • 2012


Abstract

Maximal margin based frameworks have emerged as a powerful tool for supervised learning. Extending these ideas to the unsupervised case, however, is problematic, since the underlying optimization entails a discrete component. In this paper, we first study the computational complexity of maximal hard margin clustering and show that the hard margin clustering problem can be solved exactly in O(n^{d+2}) time, where n is the number of data points and d is the dimensionality of the input data. However, since it is well known that many datasets commonly 'express' themselves primarily in far fewer dimensions, our interest is in evaluating whether a careful use of dimensionality reduction can lead to practical and effective algorithms. We build upon these observations and propose a new algorithm that gradually increases the number of features used in the separation model in each iteration, and we analyze the convergence properties of this scheme. We report promising numerical experiments based on a 'truncated' version of this approach. Our experiments indicate that, for a variety of datasets, solutions of quality comparable to those from existing techniques can be obtained in significantly less time.
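To make the abstract's dimensionality-reduction theme concrete, here is a minimal sketch (not the paper's algorithm) of margin-based two-way clustering in a reduced space: project the data onto its top principal direction and split at the widest gap between consecutive projected values, which is the maximal hard margin split in one dimension. The function names and the single-component choice are illustrative assumptions, not from the paper.

```python
import numpy as np

def margin_split_1d(z):
    """Split 1-D values into two clusters at the widest gap (the 1-D maximal margin)."""
    zs = np.sort(z)
    gaps = np.diff(zs)                      # gaps between consecutive sorted values
    i = int(np.argmax(gaps))                # widest gap defines the separating margin
    threshold = (zs[i] + zs[i + 1]) / 2.0   # midpoint of the widest gap
    labels = (z > threshold).astype(int)
    return labels, float(gaps[i])

def pca_margin_clustering(X):
    """Illustrative sketch: reduce to the top principal direction, then margin-split.

    This mimics the 'use far fewer dimensions' idea from the abstract; it is a
    crude stand-in, not the iterative feature-growing scheme the paper proposes.
    """
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)  # principal directions
    z = Xc @ Vt[0]                                     # 1-D projection
    return margin_split_1d(z)
```

On well-separated data the widest 1-D gap coincides with the between-cluster margin, so the split recovers the two groups; with more clusters or heavy overlap, the full separation model the paper develops is needed.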