The purpose of this paper is to develop new, efficient approaches based on DC (Difference of Convex functions) programming and DCA (DC Algorithm) for clustering via the minimum sum-of-squares Euclidean distance criterion. We consider the two most widely used models for the so-called Minimum Sum-of-Squares Clustering (MSSC) problem: a bilevel programming problem and a mixed integer program. First, the mixed integer formulation of MSSC is carefully studied and reformulated as a continuous optimization problem via a new result on exact penalty techniques in DC programming. DCA is then applied to the resulting problem. Second, we introduce a Gaussian kernel version of the bilevel programming formulation of MSSC, named GKMSSC. The GKMSSC problem is formulated as a DC program for which a simple and efficient DCA scheme is developed. A regularization technique is investigated to exploit the beneficial effect of DC decomposition, and a simple procedure for finding good starting points for DCA is developed. The proposed DCA schemes are original and very inexpensive because they amount to computing, at each iteration, the projection of points onto a simplex, a ball, and/or a box, all of which are available in explicit form. Numerical results on real-world datasets show the efficiency and scalability of DCA and its superiority over k-means and kernel k-means, the standard methods for clustering.
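To illustrate why such iterations are cheap, the three projections named in the abstract can be sketched in a few lines of plain Python. This is a minimal sketch, not the paper's implementation: the function names (`project_box`, `project_ball`, `project_simplex`) and their signatures are hypothetical, and the simplex projection uses the standard sort-based method, which is assumed here rather than taken from the paper.

```python
import math


def project_box(x, lo, hi):
    """Project x onto the box [lo, hi]^n by clipping each coordinate."""
    return [min(max(v, lo), hi) for v in x]


def project_ball(x, center, radius):
    """Euclidean projection of x onto the ball {y : ||y - center|| <= radius}."""
    diff = [v - c for v, c in zip(x, center)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm <= radius:
        return list(x)  # already inside the ball
    scale = radius / norm  # shrink the offset back to the boundary
    return [c + scale * d for c, d in zip(center, diff)]


def project_simplex(x):
    """Euclidean projection of x onto {w : w_i >= 0, sum(w) = 1}.

    Standard sort-based method (an assumption of this sketch):
    find the shift theta such that max(x_i - theta, 0) sums to 1.
    """
    u = sorted(x, reverse=True)
    cumsum = 0.0
    theta = 0.0
    for j, uj in enumerate(u, start=1):
        cumsum += uj
        t = (cumsum - 1.0) / j
        if uj - t > 0:  # coordinate j stays positive after shifting by t
            theta = t
    return [max(v - theta, 0.0) for v in x]
```

Each routine costs at most a sort of the input vector, so a DCA step built from these projections involves no inner iterative solver. In a clustering setting, one would typically expect the simplex projection to act on a point's assignment weights and the ball or box projection on cluster centers, although the exact roles depend on the formulation in the paper.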