A Direct Formulation for Sparse PCA Using Semidefinite Programming

  • Authors:
  • Alexandre d'Aspremont;Laurent El Ghaoui;Michael I. Jordan;Gert R. G. Lanckriet

  • Affiliations:
  • aspremon@princeton.edu;elghaoui@eecs.berkeley.edu;jordan@cs.berkeley.edu;gert@ece.ucsd.edu

  • Venue:
  • SIAM Review
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Given a covariance matrix, we consider the problem of maximizing the variance explained by a particular linear combination of the input variables while constraining the number of nonzero coefficients in this combination. This problem arises in the decomposition of a covariance matrix into sparse factors or sparse principal component analysis (PCA), and has wide applications ranging from biology to finance. We use a modification of the classical variational representation of the largest eigenvalue of a symmetric matrix, where cardinality is constrained, and derive a semidefinite programming-based relaxation for our problem. We also discuss Nesterov's smooth minimization technique applied to the semidefinite program arising in the semidefinite relaxation of the sparse PCA problem. The method has complexity $O(n^4 \sqrt{\log(n)}/\epsilon)$, where $n$ is the size of the underlying covariance matrix and $\epsilon$ is the desired absolute accuracy on the optimal value of the problem.