Ensemble clustering using semidefinite programming with applications

  • Authors:
  • Vikas Singh; Lopamudra Mukherjee; Jiming Peng; Jinhui Xu

  • Affiliations:
  • Department of Biostatistics & Medical Informatics, University of Wisconsin-Madison, Madison, USA; Department of Mathematics and Computer Science, University of Wisconsin-Whitewater, Whitewater, USA; Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign, Urbana-Champaign, USA; Department of Computer Science and Engineering, The State University of New York at Buffalo, Buffalo, USA

  • Venue:
  • Machine Learning
  • Year:
  • 2010

Abstract

In this paper, we study the ensemble clustering problem, where the input is a set of multiple clustering solutions. The goal of ensemble clustering algorithms is to aggregate these solutions into a single solution that maximizes agreement with the input ensemble. We obtain several new results for this problem. Specifically, we show that the notion of agreement in this setting is better captured by a 2D string encoding than by a voting strategy, which is common among existing approaches. Our optimization proceeds by first constructing a non-linear objective function, which is then transformed into a 0-1 semidefinite program (SDP) using novel convexification techniques. This model can subsequently be relaxed to a polynomial-time solvable SDP. In addition to the theoretical contributions, our experimental results on standard machine learning and synthetic datasets show that this approach leads to improvements not only in terms of the proposed agreement measure but also in terms of existing agreement measures based on voting strategies. We also identify several new application scenarios for this problem. These include combining multiple image segmentations and generating tissue maps from multiple-channel Diffusion Tensor brain images to identify the underlying structure of the brain.
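
For readers unfamiliar with the voting strategies the abstract contrasts against, the following is a minimal sketch of one common voting-style summary, the pairwise co-association matrix. It is not the paper's 2D string encoding or its SDP model; the function name `co_association` and the toy `ensemble` are assumptions made purely for illustration.

```python
import numpy as np

def co_association(labelings):
    """Fraction of input clusterings that put each pair of items together.

    Illustrative voting-style baseline only; not the method of the paper.
    """
    labelings = np.asarray(labelings)  # shape: (m clusterings, n items)
    # same[k, i, j] is True when clustering k assigns items i and j
    # to the same cluster; comparing labels makes this independent of
    # how each clustering happens to name its clusters.
    same = labelings[:, :, None] == labelings[:, None, :]
    return same.mean(axis=0)           # shape: (n items, n items)

# Hypothetical ensemble: three clusterings of four items.
ensemble = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as the first, with labels swapped
    [0, 0, 0, 1],
]
C = co_association(ensemble)
```

Each entry `C[i, j]` is the share of clusterings that "vote" for items `i` and `j` belonging together. A consensus clustering could then be derived from `C`, for example by thresholding or by running a standard clustering method on it; that step is left out here.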