Non-linear PCA: a missing data approach

Authors:
Matthias Scholz;Fatma Kaplan;Charles L. Guy;Joachim Kopka;Joachim Selbig
Affiliations:
Max Planck Institute of Molecular Plant Physiology Potsdam, Germany;University of Florida, Plant Molecular and Cellular Biology Program, Department of Environmental Horticulture Gainesville, Florida 32611, USA;University of Florida, Plant Molecular and Cellular Biology Program, Department of Environmental Horticulture Gainesville, Florida 32611, USA;Max Planck Institute of Molecular Plant Physiology Potsdam, Germany;Max Planck Institute of Molecular Plant Physiology Potsdam, Germany
Venue:
Bioinformatics
Year:
2005

Citing 0
Cited 7

AVEDA: Statistical Tests for Finding Interesting Visualisations
KES '09 Proceedings of the 13th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems: Part I

Analysing periodic phenomena by circular PCA
BIRD'07 Proceedings of the 1st international conference on Bioinformatics research and development

Imputation of missing values for compositional data using classical and robust methods
Computational Statistics & Data Analysis

Impact of missing value imputation on classification for DNA microarray gene expression data: a model-based study
EURASIP Journal on Bioinformatics and Systems Biology

Visualisation of test coverage for conformance tests of low level communication protocols
KES'10 Proceedings of the 14th international conference on Knowledge-based and intelligent information and engineering systems: Part II

Nonlinear enhancement of noisy speech, using continuous attractor dynamics formed in recurrent neural networks
Neurocomputing

Validation of Nonlinear PCA
Neural Processing Letters

Abstract

Motivation: Visualizing and analysing the potential non-linear structure of a dataset is becoming an important task in molecular biology. This is even more challenging when the data have missing values.

Results: Here, we propose an inverse model that performs non-linear principal component analysis (NLPCA) from incomplete datasets. Missing values are ignored while optimizing the model, but can be estimated afterwards. Results are shown for both artificial and experimental datasets. In contrast to linear methods, non-linear methods were able to give better missing value estimations for non-linear structured data.

Application: We applied this technique to a time course of metabolite data from a cold stress experiment on the model plant Arabidopsis thaliana, and could approximate the mapping function from any time point to the metabolite responses. Thus, the inverse NLPCA provides greatly improved information for better understanding the complex response to cold stress.

Contact: scholz@mpimp-golm.mpg.de
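
The idea stated in the abstract (fit an inverse model on the observed entries only, then read off estimates for the missing ones) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the network architecture, the optimizer, or any regularization, so the single tanh hidden layer, the plain gradient descent, the synthetic curve, and all names here (fit_inverse_nlpca, n_hidden, lr, n_steps) are assumptions made for illustration.

```python
import numpy as np

def fit_inverse_nlpca(X, n_components=1, n_hidden=8, lr=0.02, n_steps=3000, seed=0):
    """Illustrative sketch: learn latent inputs Z and a decoder Z -> X,
    measuring the error on observed (non-NaN) entries only."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mask = ~np.isnan(X)                 # True where a value was observed
    X0 = np.where(mask, X, 0.0)         # placeholder zeros, never used in the error

    # Free parameters: one latent vector per sample plus the decoder weights.
    Z = 0.1 * rng.standard_normal((n, n_components))
    W1 = 0.5 * rng.standard_normal((n_components, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = 0.5 * rng.standard_normal((n_hidden, d))
    b2 = np.zeros(d)

    def decode(Z):
        H = np.tanh(Z @ W1 + b1)
        return H, H @ W2 + b2

    def masked_loss(Y):
        R = np.where(mask, Y - X0, 0.0)
        return 0.5 * np.sum(R ** 2) / mask.sum()

    history = []
    for _ in range(n_steps):
        H, Y = decode(Z)
        history.append(masked_loss(Y))
        # Residuals are zeroed at missing entries, so they send no gradient.
        R = np.where(mask, Y - X0, 0.0)
        dA = (R @ W2.T) * (1.0 - H ** 2)    # backprop through tanh
        grad_Z = dA @ W1.T                  # computed before W1 is updated
        grad_W1 = Z.T @ dA / n
        # Weight gradients are averaged over samples; each row of Z only
        # depends on its own sample, so its gradient is used unscaled.
        W2 -= lr * (H.T @ R) / n
        b2 -= lr * R.sum(axis=0) / n
        W1 -= lr * grad_W1
        b1 -= lr * dA.sum(axis=0) / n
        Z -= lr * grad_Z

    _, Y = decode(Z)
    history.append(masked_loss(Y))
    return Z, Y, mask, history

# Synthetic example (assumed data): a 1-D curve in 3-D with ~10% values removed.
rng = np.random.default_rng(1)
t = np.linspace(-1.0, 1.0, 60)
X_full = np.column_stack([t, t ** 2, t ** 3])
X_missing = X_full.copy()
X_missing[rng.random(X_full.shape) < 0.1] = np.nan

Z, Y, mask, history = fit_inverse_nlpca(X_missing)

# Keep observed values as they are; fill the gaps with the model output.
X_filled = np.where(mask, X_missing, Y)
```

Because the latent values Z are optimized directly rather than computed from the data by an encoder, a sample with gaps needs no special handling: its latent vector is fitted from whichever entries are present, and the decoder output at that vector supplies the estimate for the rest.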