Using combinatorial optimization in model-based trimmed clustering with cardinality constraints

Authors:
María Teresa Gallegos; Gunter Ritter
Affiliations:
Faculty of Informatics and Mathematics, University of Passau, D-94030 Passau, Germany; Faculty of Informatics and Mathematics, University of Passau, D-94030 Passau, Germany
Venue:
Computational Statistics & Data Analysis
Year:
2010

Abstract

Statistical clustering criteria with free scale parameters and unknown cluster sizes are inclined to create small, spurious clusters. To mitigate this tendency, a statistical model for cardinality-constrained clustering of data with gross outliers is established, its maximum likelihood and maximum a posteriori clustering criteria are derived, and their consistency and robustness are analyzed. The criteria lead to constrained optimization problems that can be solved by using iterative, alternating trimming algorithms of k-means type. Each step in the algorithms requires the solution of a λ-assignment problem known from combinatorial optimization. The method allows one to estimate the numbers of clusters and outliers. It is illustrated with a synthetic data set and a real one.
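The alternating scheme described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm. It assumes a squared Euclidean cost in place of the paper's likelihood-based criteria, and it replaces the λ-assignment solver with a simple reduction to a square linear assignment problem (b mandatory slots per cluster, some free slots, and n - r zero-cost outlier slots). All function names, and the parameter names `r` (number of retained points) and `b` (minimum cluster size), are hypothetical and chosen for illustration only.

```python
def solve_assignment(cost):
    """Hungarian method, O(n^3), for a square cost matrix.
    Returns the column chosen for each row."""
    n = len(cost)
    INF = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (n + 1)   # dual potentials
    p, way = [0] * (n + 1), [0] * (n + 1)     # p[j]: row matched to column j (1-based)
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [INF] * (n + 1), [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                              # flip the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    cols = [0] * n
    for j in range(1, n + 1):
        cols[p[j] - 1] = j - 1
    return cols


def constrained_assignment(points, centers, r, b):
    """Label each point with a cluster index or -1 (outlier) so that exactly
    r points are retained and every cluster receives at least b of them,
    minimising the summed squared distance of the retained points."""
    n, k = len(points), len(centers)
    assert b >= 1 and k * b <= r <= n
    d = [[sum((x - c) ** 2 for x, c in zip(pt, ctr)) for ctr in centers]
         for pt in points]
    # Column slots: b mandatory per cluster, r - k*b free ("F"), n - r outlier (-1).
    slots = [j for j in range(k) for _ in range(b)] + ["F"] * (r - k * b) + [-1] * (n - r)

    def slot_cost(row, s):
        if s == -1:
            return 0.0          # discarding a point costs nothing
        if s == "F":
            return min(row)     # a free slot goes to the nearest cluster
        return row[s]

    cols = solve_assignment([[slot_cost(row, s) for s in slots] for row in d])
    return [row.index(min(row)) if slots[c] == "F" else slots[c]
            for row, c in zip(d, cols)]


def trimmed_clustering(points, centers, r, b, iters=20):
    """Alternate the constrained assignment step with a mean update."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(iters):
        new_labels = constrained_assignment(points, centers, r, b)
        if new_labels == labels:
            break                # assignment is stable
        labels = new_labels
        for j in range(len(centers)):
            members = [pt for pt, l in zip(points, labels) if l == j]
            centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Toy data: two tight groups of four points plus two gross outliers.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50), (-40, 60)]
labels, centers = trimmed_clustering(points, [(0, 0), (10, 10)], r=8, b=3)
print(labels, centers)
```

In this toy run the two distant points receive the label -1 and each group keeps its four members. The slot construction makes the cardinality constraints part of the assignment step itself, so no cluster can shrink below `b` points. A dedicated transportation or network-flow solver would scale better than the expanded square matrix used here.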