Fast regularized canonical correlation analysis

  • Authors:
  • Raul Cruz-Cano;Mei-Ling Ting Lee

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2014

Abstract

Canonical correlation analysis is a popular statistical method for studying the correlations between two sets of variables. Finding the canonical correlations between the two datasets requires inverting their sample correlation matrices. When the number of variables is large compared to the number of experimental units, these matrices are singular or too ill-conditioned to invert directly, so a multiple of the identity matrix is added to each of them. This procedure is known as regularization. In this paper we present an alternative to the existing regularization algorithm. The proposed method is based on the estimates of the correlation matrices that minimize the mean squared error risk function. The solution of this optimization problem can be found analytically and consists of a small set of computationally inexpensive equations. We also present evidence that the proposed method is more stable and provides more accurate results than the standard regularized canonical correlation method. Finally, applying our method to NCI-60 microRNA cancer data shows that it can deliver useful insights in case studies that involve hundreds of variables.
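
To make the regularization step concrete, the following is a minimal sketch of the standard ridge-style approach the abstract describes (adding a multiple of the identity to each sample correlation matrix before inversion). It is not the proposed method of the paper, whose analytic equations are not given in the abstract. The function name `regularized_cca` and the penalty values `lam_x` and `lam_y` are illustrative assumptions, and the data are synthetic.

```python
import numpy as np


def regularized_cca(X, Y, lam_x=0.1, lam_y=0.1):
    """Ridge-regularized canonical correlations (illustrative sketch).

    X is (n, p), Y is (n, q); rows are experimental units.
    lam_x and lam_y are hypothetical penalty values chosen by the user.
    Assumes no column of X or Y is constant.
    """
    n = X.shape[0]
    # Standardize columns so the cross-products are sample correlations.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    Rxx = Xs.T @ Xs / (n - 1)
    Ryy = Ys.T @ Ys / (n - 1)
    Rxy = Xs.T @ Ys / (n - 1)

    # Regularization: add a multiple of the identity so the matrices
    # become positive definite even when p or q exceeds n.
    Rxx_reg = Rxx + lam_x * np.eye(Rxx.shape[0])
    Ryy_reg = Ryy + lam_y * np.eye(Ryy.shape[0])

    # Whiten with Cholesky factors instead of forming explicit inverses:
    # K = Lx^{-1} Rxy Ly^{-T}, where Rxx_reg = Lx Lx^T and Ryy_reg = Ly Ly^T.
    Lx = np.linalg.cholesky(Rxx_reg)
    Ly = np.linalg.cholesky(Ryy_reg)
    K = np.linalg.solve(Lx, Rxy)
    K = np.linalg.solve(Ly, K.T).T

    # The singular values of K are the regularized canonical correlations,
    # returned in descending order.
    return np.linalg.svd(K, compute_uv=False)


# Synthetic example with more variables than experimental units.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((20, 50))
Y_demo = rng.standard_normal((20, 40))
rho_demo = regularized_cca(X_demo, Y_demo)
print(rho_demo[:3])
```

With 20 units and 50 or 40 variables, the unregularized correlation matrices have rank at most 19 and cannot be inverted, which is the situation the abstract addresses. In this sketch the penalties are fixed by hand; how to choose them is the part the paper replaces with analytically derived estimates.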