On the number of principal components: A test of dimensionality based on measurements of similarity between matrices

  • Authors:
  • Stéphane Dray

  • Affiliations:
  • Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon; Université Lyon 1; CNRS; UMR 5558, 43 boulevard du 11 novembre 1918, Villeurbanne F-69622, France

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2008

Abstract

An important problem in principal component analysis (PCA) is the estimation of the correct number of components to retain. PCA is most often used to reduce a set of observed variables to a new set of variables of lower dimensionality. The choice of this dimensionality is a crucial step for the interpretation of results or subsequent analyses, because a poor choice can lead to a loss of information (underestimation) or to the introduction of random noise (overestimation). New techniques are proposed to evaluate the dimensionality in PCA. They are based on similarity measurements, singular value decomposition and permutation procedures. A simulation study is conducted to evaluate the relative merits of the proposed approaches. The results show that one method based on the RV coefficient is very accurate and seems to be more efficient than other existing approaches.
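
The abstract names its ingredients (the RV coefficient, singular value decomposition) without giving the test itself, so the following is only a minimal sketch of those ingredients and not the procedure proposed in the paper. It assumes the standard definition of the RV coefficient for column-centered matrices, and it uses simulated data with two underlying components. The function names `rv_coefficient` and `rank_k_approximation`, the data sizes, and the noise level are all illustrative choices. The permutation step mentioned in the abstract is not shown.

```python
import numpy as np


def rv_coefficient(x, y):
    """RV coefficient between two matrices that share the same rows."""
    # Center the columns so that x @ x.T is a cross-product matrix
    # of deviations from the column means.
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    sx = x @ x.T
    sy = y @ y.T
    numerator = np.trace(sx @ sy)
    denominator = np.sqrt(np.trace(sx @ sx) * np.trace(sy @ sy))
    return numerator / denominator


def rank_k_approximation(x, k):
    """Best rank-k approximation of the column-centered data, via SVD."""
    xc = x - x.mean(axis=0)
    u, s, vt = np.linalg.svd(xc, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k, :]


# Simulated data (an assumption for illustration): 50 observations of
# 6 variables generated from 2 latent components plus small noise.
rng = np.random.default_rng(0)
scores = rng.normal(size=(50, 2))
loadings = rng.normal(size=(2, 6))
data = scores @ loadings + 0.1 * rng.normal(size=(50, 6))

# Similarity between the data and its reconstruction from k components.
rv_by_rank = [
    rv_coefficient(data, rank_k_approximation(data, k)) for k in range(1, 7)
]
for k, rv in enumerate(rv_by_rank, start=1):
    print(f"k={k}: RV={rv:.4f}")
```

In this sketch the RV value cannot decrease as k grows and equals 1 once all components are kept, so the raw similarity alone cannot select a dimensionality. A test of the kind the abstract describes would need a reference distribution, for example one obtained by permutation, to judge whether an additional component carries more than noise.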