Computational experience on four algorithms for the hard clustering problem. Pattern Recognition Letters.
Cluster analysis and mathematical programming. Mathematical Programming, Series A and B (special issue: papers from ISMP97, the 16th International Symposium on Mathematical Programming, Lausanne EPFL).
An interior point algorithm for minimum sum-of-squares clustering. SIAM Journal on Scientific Computing.
Clustering Algorithms.
A global optimization RLT-based approach for solving the hard clustering problem. Journal of Global Optimization.
NP-hardness of Euclidean sum-of-squares clustering. Machine Learning.
An improved column generation algorithm for minimum sum-of-squares clustering. Mathematical Programming, Series A and B.
Minimum sum-of-squares clustering consists of partitioning a given set of n points into c clusters so as to minimize the sum of squared Euclidean distances from each point to the centroid of its cluster. Recently, Sherali and Desai (JOGO, 2005) proposed a reformulation-linearization-based branch-and-bound algorithm for this problem, claiming to solve instances with up to 1,000 points. In this paper, their algorithm is investigated in further detail, and some of their computational experiments are reproduced. However, our computational times turn out to be drastically larger: for two data sets from the literature, only instances with up to 20 points could be solved in less than 10 hours of computer time. Possible reasons for this discrepancy are discussed. We also explore the effect of a symmetry-breaking rule due to Plastria (EJOR, 2002) and of introducing, in two dimensions, valid inequalities derived from the convex hull of the points that may belong to each cluster.
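To make the objective concrete, here is a minimal sketch (in Python with NumPy; the function name `mssc_objective` is illustrative, not from the paper) that evaluates the minimum sum-of-squares clustering cost of a given partition, i.e., the quantity the branch-and-bound algorithm minimizes over all partitions:

```python
import numpy as np

def mssc_objective(points, labels):
    """Sum, over all clusters, of squared Euclidean distances from each
    point to the centroid of its cluster (the MSSC objective)."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        cluster = points[labels == c]          # points assigned to cluster c
        centroid = cluster.mean(axis=0)        # centroid of cluster c
        total += ((cluster - centroid) ** 2).sum()
    return total

# Tiny example: two well-separated pairs, each pair forming one cluster.
pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
print(mssc_objective(pts, [0, 0, 1, 1]))  # 4.0: each point lies 1 away from its centroid
```

Exact algorithms such as the one studied here search over partitions for the assignment minimizing this value; the example only evaluates the objective for a fixed assignment.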