Maximal margin based frameworks have emerged as a powerful tool for supervised learning. Extending these ideas to the unsupervised case, however, is problematic, since the underlying optimization entails a discrete component. In this paper, we first study the computational complexity of maximal hard margin clustering and show that the hard margin clustering problem can be solved exactly in O(n^(d+2)) time, where n is the number of data points and d is the dimensionality of the input data. Since many datasets are well known to 'express' themselves primarily in far fewer dimensions, we then evaluate whether careful use of dimensionality reduction can lead to practical and effective algorithms. Building on these observations, we propose a new algorithm that gradually increases the number of features used in the separation model in each iteration, and we analyze the convergence properties of this scheme. We report promising numerical experiments based on a 'truncated' version of this approach. Our experiments indicate that, for a variety of datasets, solutions as good as those from other existing techniques can be obtained in significantly less time.
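To make the iterative idea concrete, the following is a minimal sketch, not the authors' implementation: cluster the data in a low-dimensional projection, measure the hard margin the resulting labels induce, and add features until the margin stops improving. The function name is hypothetical, and PCA, k-means, and a large-C linear SVM are assumptions standing in for the paper's dimensionality reduction and separation model.

    # Minimal sketch of a "grow the feature count" margin-clustering loop.
    # Assumptions: PCA as the dimensionality reduction, KMeans to propose a
    # 2-way split, and LinearSVC with large C as a hard-margin surrogate.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans
    from sklearn.svm import LinearSVC

    def truncated_margin_clustering(X, max_dim=None, tol=1e-4):
        """Increase the number of projected features until the margin stalls."""
        n, d = X.shape
        max_dim = min(max_dim or d, d)
        best_labels, best_margin = None, -np.inf
        for k in range(1, max_dim + 1):
            Z = PCA(n_components=k).fit_transform(X)       # k-dimensional view
            labels = KMeans(n_clusters=2, n_init=10).fit_predict(Z)
            if len(set(labels)) < 2:                       # degenerate split
                continue
            svc = LinearSVC(C=1e6).fit(Z, labels)          # large C ~ hard margin
            margin = 1.0 / np.linalg.norm(svc.coef_)       # margin width 1/||w||
            if margin <= best_margin + tol:                # extra features no longer help
                break
            best_labels, best_margin = labels, margin
        return best_labels, best_margin

Truncating the loop early, as in the experiments described above, trades a possibly smaller margin for substantially less computation, which is the trade-off the abstract reports.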