Motivation: Metabolomics datasets are generally large and complex. Principal component analysis (PCA) gives a simplified view of the variation in the data; the PCA model can then be interpreted and the processes underlying that variation analysed. In metabolomics, a priori information about the data is often available. Various forms of this information can be used in an unsupervised data analysis with weighted PCA (WPCA). A WPCA model gives a view of the data that differs from the one obtained with PCA, and it adds to the interpretation of the information in a metabolomics dataset.

Results: A method is presented to translate spectra of repeated measurements into weights describing the experimental error. These weights are used in the data analysis with WPCA. The resulting WPCA model accounts for the non-uniform experimental error and therefore focuses more on the natural variation in the data.

Availability: MATLAB M-files for the algorithm used in this research are available at http://www-its.chem.uva.nl/research/pac/Software/pcaw.zip
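
The idea of turning repeated measurements into weights and using them in a weighted PCA can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation. It assumes a simplified setting in which each spectral variable gets a single weight, taken here as the inverse of its error variance across the repeated spectra; the function names `weights_from_replicates` and `weighted_pca`, the inverse-variance choice, and the small `eps` guard are all assumptions made for illustration.

```python
import numpy as np


def weights_from_replicates(replicates, eps=1e-12):
    """Assumed weighting scheme: inverse error variance per variable.

    replicates: array of shape (n_repeats, n_variables) holding repeated
    spectra of the same sample. eps avoids division by zero.
    """
    error_var = replicates.var(axis=0, ddof=1)
    return 1.0 / (error_var + eps)


def weighted_pca(X, w, n_components=2):
    """Column-weighted PCA.

    Minimises sum_ij w[j] * (Xc[i, j] - (scores @ loadings)[i, j])**2,
    where Xc is the column-centred data. With one weight per column this
    reduces to an ordinary SVD of the rescaled data.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    sqrt_w = np.sqrt(w)
    U, s, Vt = np.linalg.svd(Xc * sqrt_w, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    # Undo the rescaling so loadings refer to the original variables.
    loadings = Vt[:n_components] / sqrt_w
    return scores, loadings, mean


# Usage on synthetic data (purely illustrative, not real spectra).
rng = np.random.default_rng(0)
replicates = rng.normal(size=(6, 5)) * np.array([0.1, 0.1, 1.0, 1.0, 2.0])
w = weights_from_replicates(replicates)
X = rng.normal(size=(20, 5))
scores, loadings, mean = weighted_pca(X, w, n_components=2)
```

Variables with a large experimental error receive a small weight and so contribute less to the fitted components. Weights that differ per individual measurement, rather than per variable, cannot be handled by a single SVD and would need an iterative algorithm, which this sketch does not attempt.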