Using combinatorial optimization in model-based trimmed clustering with cardinality constraints

  • Authors:
  • María Teresa Gallegos;Gunter Ritter

  • Affiliations:
  • Faculty of Informatics and Mathematics, University of Passau, D-94030 Passau, Germany;Faculty of Informatics and Mathematics, University of Passau, D-94030 Passau, Germany

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2010

Abstract

Statistical clustering criteria with free scale parameters and unknown cluster sizes are inclined to create small, spurious clusters. To mitigate this tendency, a statistical model for cardinality-constrained clustering of data with gross outliers is established, its maximum likelihood and maximum a posteriori clustering criteria are derived, and their consistency and robustness are analyzed. The criteria lead to constrained optimization problems that can be solved by iterative, alternating trimming algorithms of k-means type. Each step of these algorithms requires the solution of a λ-assignment problem known from combinatorial optimization. The method allows one to estimate the number of clusters and the number of outliers. It is illustrated with one synthetic and one real data set.
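
To make the alternating scheme concrete, the following is a minimal sketch of one possible k-means-type trimming loop with a cardinality constraint. It is not the authors' algorithm. It assumes one-dimensional data and squared Euclidean distance as the cost, whereas the paper's criteria are likelihood-based. It also replaces the constrained assignment step with a reduction to an ordinary square assignment problem, solved by a textbook Hungarian routine. All names (`hungarian`, `constrained_assign`, `trimmed_constrained_kmeans`) and the parameters `r` (number of retained points) and `b` (minimum cluster size) are hypothetical and chosen for illustration.

```python
OUTLIER = -1  # label for discarded (trimmed) points


def hungarian(cost):
    """Min-cost perfect matching on a square matrix; returns the column of each row."""
    n = len(cost)
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (n + 1)   # dual potentials
    p, way = [0] * (n + 1), [0] * (n + 1)     # p[j]: row matched to column j (1-based)
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (n + 1), [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                              # flip the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [0] * n
    for j in range(1, n + 1):
        match[p[j] - 1] = j - 1
    return match


def constrained_assign(points, centers, r, b):
    """Keep exactly r points, give every cluster at least b of them, trim the rest."""
    n, g = len(points), len(centers)
    assert b >= 1 and g * b <= r <= n
    d = [[(x - c) ** 2 for c in centers] for x in points]
    cost = []
    for i in range(n):
        row = [d[i][j] for j in range(g) for _ in range(b)]  # b mandatory slots per cluster
        row += [min(d[i])] * (r - g * b)                     # free slots: nearest cluster
        row += [0.0] * (n - r)                               # outlier slots cost nothing
        cost.append(row)
    labels = []
    for i, col in enumerate(hungarian(cost)):
        if col < g * b:
            labels.append(col // b)
        elif col < r:
            labels.append(d[i].index(min(d[i])))
        else:
            labels.append(OUTLIER)
    return labels


def trimmed_constrained_kmeans(points, centers, r, b, max_iter=50):
    """Alternate constrained assignment and mean updates until the labels stabilize."""
    labels = None
    for _ in range(max_iter):
        new_labels = constrained_assign(points, centers, r, b)
        if new_labels == labels:
            break
        labels = new_labels
        centers = [
            sum(x for x, lab in zip(points, labels) if lab == j) / labels.count(j)
            for j in range(len(centers))
        ]
    return labels, centers
```

For example, `trimmed_constrained_kmeans([0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 100.0], [0.0, 5.0], r=6, b=2)` retains six points in two clusters and discards the point at 100.0. The slot replication keeps the sketch short but builds an n-by-n cost matrix, so it is only practical for small data sets.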