Full regularization path for sparse principal component analysis

Authors:
Alexandre d'Aspremont; Francis R. Bach; Laurent El Ghaoui
Affiliations:
Princeton University, Princeton, NJ; Ecole des Mines de Paris, France; U. C. Berkeley, Berkeley, CA
Venue:
Proceedings of the 24th international conference on Machine learning
Year:
2007

Abstract

Given a sample covariance matrix, we examine the problem of maximizing the variance explained by a particular linear combination of the input variables while constraining the number of nonzero coefficients in this combination. This is known as sparse principal component analysis and has a wide array of applications in machine learning and engineering. We formulate a new semidefinite relaxation of this problem and derive a greedy algorithm that computes a full set of good solutions for all numbers of nonzero coefficients, with complexity O(n^3), where n is the number of variables. We then use the same relaxation to derive sufficient conditions for global optimality of a solution, which can be tested in O(n^3). We show on toy examples and biological data that our algorithm does provide globally optimal solutions in many cases.
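
To make the idea of a "full set of solutions for all numbers of nonzero coefficients" concrete, the following is a minimal sketch of a naive greedy forward search over support sets, assuming NumPy. It is not the algorithm from the paper: it recomputes an eigenvalue for every candidate variable at every step, so it does not reach the stated O(n^3) complexity, and it includes no optimality test. The function name `greedy_sparse_pca_path` and the example data are hypothetical and chosen only for illustration.

```python
import numpy as np


def greedy_sparse_pca_path(sigma):
    """Naive greedy forward selection for sparse PCA (illustrative only).

    sigma: symmetric positive semidefinite covariance matrix, shape (n, n).
    Returns a list of (support, variance) pairs, one per cardinality 1..n.
    """
    n = sigma.shape[0]
    support = []                 # indices selected so far
    remaining = set(range(n))    # indices not yet selected
    path = []
    for _ in range(n):
        best_var, best_i = -np.inf, None
        for i in sorted(remaining):
            idx = support + [i]
            # The largest eigenvalue of the principal submatrix is the
            # maximum variance achievable with this set of nonzero entries.
            var = np.linalg.eigvalsh(sigma[np.ix_(idx, idx)])[-1]
            if var > best_var:
                best_var, best_i = var, i
        support.append(best_i)
        remaining.remove(best_i)
        path.append((list(support), float(best_var)))
    return path


# Example on a synthetic sample covariance matrix (made-up data).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 6))
sample_cov = A.T @ A / 50
for chosen, variance in greedy_sparse_pca_path(sample_cov):
    print(len(chosen), sorted(chosen), round(variance, 4))
```

Each entry of the returned path gives a candidate support of a given size and the variance it explains. The explained variance is nondecreasing along the path, and at full cardinality it equals the largest eigenvalue of the covariance matrix, which is ordinary PCA.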