Effective PCA for high-dimension, low-sample-size data with noise reduction via geometric representations

  • Authors:
  • Kazuyoshi Yata;Makoto Aoshima

  • Affiliations:
  • -;-

  • Venue:
  • Journal of Multivariate Analysis
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this article, we propose a new estimation methodology to deal with PCA for high-dimension, low-sample-size (HDLSS) data. We first show that HDLSS datasets have different geometric representations depending on whether a @r-mixing-type dependency appears in variables or not. When the @r-mixing-type dependency appears in variables, the HDLSS data converge to an n-dimensional surface of unit sphere with increasing dimension. We pay special attention to this phenomenon. We propose a method called the noise-reduction methodology to estimate eigenvalues of a HDLSS dataset. We show that the eigenvalue estimator holds consistency properties along with its limiting distribution in HDLSS context. We consider consistency properties of PC directions. We apply the noise-reduction methodology to estimating PC scores. We also give an application in the discriminant analysis for HDLSS datasets by using the inverse covariance matrix estimator induced by the noise-reduction methodology.