Integer linear programming models for constrained clustering

  • Authors:
  • Marianne Mueller;Stefan Kramer

  • Affiliations:
  • Technische Universität München, Institut für Informatik, Garching, Germany;Technische Universität München, Institut für Informatik, Garching, Germany

  • Venue:
  • DS'10 Proceedings of the 13th international conference on Discovery science
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

We address the problem of building a clustering as a subset of a (possibly large) set of candidate clusters under user-defined constraints. In contrast to most approaches to constrained clustering, we do not constrain the way observations can be grouped into clusters, but the way candidate clusters can be combined into suitable clusterings. The constraints may concern the type of clustering (e.g., complete clusterings, overlapping or encompassing clusters) and the composition of clusterings (e.g., certain clusters excluding others). In the paper, we show that these constraints can be translated into integer linear programs, which can be solved by standard optimization packages. Our experiments with benchmark and real-world data investigates the quality of the clusterings and the running times depending on a variety of parameters.