Large-scale SVD and manifold learning

  • Authors:
  • Ameet Talwalkar;Sanjiv Kumar;Mehryar Mohri;Henry Rowley

  • Affiliations:
  • University of California, Berkeley, Division of Computer Science, Berkeley, CA;Google Research, New York, NY;Courant Institute and Google Research, New York, NY;Google Research, Mountain View, CA

  • Venue:
  • The Journal of Machine Learning Research
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper examines the efficacy of sampling-based low-rank approximation techniques when applied to large dense kernel matrices. We analyze two common approximate singular value decomposition techniques, namely the Nyström and Column sampling methods. We present a theoretical comparison between these two methods, provide novel insights regarding their suitability for various tasks and present experimental results that support our theory. Our results illustrate the relative strengths of each method. We next examine the performance of these two techniques on the large-scale task of extracting low-dimensional manifold structure given millions of high-dimensional face images. We address the computational challenges of non-linear dimensionality reduction via Isomap and Laplacian Eigenmaps, using a graph containing about 18 million nodes and 65 million edges. We present extensive experiments on learning low-dimensional embeddings for two large face data sets: CMU-PIE (35 thousand faces) and a web data set (18 million faces). Our comparisons show that the Nyström approximation is superior to the Column sampling method for this task. Furthermore, approximate Isomap tends to perform better than Laplacian Eigenmaps on both clustering and classification with the labeled CMU-PIE data set.