Principal component analysis for data containing outliers and missing elements

  • Authors:
  • Sven Serneels; Tim Verdonck

  • Affiliations:
  • ChemometriX Group, Department of Chemistry, University of Antwerp, Belgium; Agoras Group, Department of Mathematics and Informatics, University of Antwerp, Belgium

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2008

Abstract

Two approaches are presented for performing principal component analysis (PCA) on data that contain both outlying cases and missing elements. First, an eigendecomposition of a covariance matrix that can deal with such data is proposed, but this approach is not suitable for data in which the number of variables exceeds the number of cases. Alternatively, an expectation robust (ER) algorithm is proposed to adapt the existing methodology for robust PCA to data containing missing elements. According to an extensive simulation study, the ER approach performs well for all data sizes considered. Simulations and an example show that, by virtue of the ER algorithm, the properties of the existing methods for robust PCA carry through to data with missing elements.
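
The abstract does not spell out the steps of the ER algorithm, so the following is only a minimal sketch of the general idea it describes: alternate between fitting a robust PCA to a completed data matrix and re-imputing the missing elements from that fit. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names (`spatial_median`, `spherical_pca`, `er_pca_sketch`), the use of spherical PCA as the robust PCA step, the column-median starting values, and the stopping rule.

```python
import numpy as np


def spatial_median(X, n_iter=100, tol=1e-8):
    """Spatial (L1) median of the rows of X via Weiszfeld iterations."""
    m = np.median(X, axis=0)
    for _ in range(n_iter):
        d = np.maximum(np.linalg.norm(X - m, axis=1), 1e-12)  # avoid 1/0
        w = 1.0 / d
        m_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(m_new - m) < tol:
            return m_new
        m = m_new
    return m


def spherical_pca(X, k):
    """Stand-in robust PCA (an assumption): center at the spatial median,
    project rows onto the unit sphere, then take the top-k right singular
    vectors of the projected data as loadings."""
    center = spatial_median(X)
    Z = X - center
    norms = np.maximum(np.linalg.norm(Z, axis=1), 1e-12)
    _, _, Vt = np.linalg.svd(Z / norms[:, None], full_matrices=False)
    return center, Vt[:k].T  # loadings have shape (p, k)


def er_pca_sketch(X, k, n_iter=50, tol=1e-6):
    """Hypothetical ER-style loop: robust fit, then re-impute, until stable.

    Missing elements are encoded as NaN. Returns (center, loadings, filled).
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Assumed starting values: column-wise medians of the observed entries.
    filled = np.where(missing, np.nanmedian(X, axis=0), X)
    for _ in range(n_iter):
        center, P = spherical_pca(filled, k)           # robust step
        fitted = center + (filled - center) @ P @ P.T  # rank-k reconstruction
        new_filled = np.where(missing, fitted, X)      # update missing cells only
        change = np.linalg.norm(new_filled - filled)
        filled = new_filled
        if change < tol * max(1.0, np.linalg.norm(filled)):
            break
    center, P = spherical_pca(filled, k)
    return center, P, filled
```

Calling `er_pca_sketch(X, k)` on an array whose missing elements are `NaN` returns a robust center, a `p × k` loading matrix, and the completed data. Observed entries are never altered; only the missing cells are updated between iterations. Any other robust PCA routine could be substituted for `spherical_pca` in this loop.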