Two-Way Analysis of High-Dimensional Collinear Data

  • Authors: Ilkka Huopaniemi; Tommi Suvitaival; Janne Nikkilä; Matej Orešič; Samuel Kaski

  • Affiliations:
  • Ilkka Huopaniemi: Department of Information and Computer Science, Helsinki University of Technology, Finland FI-02015
  • Tommi Suvitaival: Department of Information and Computer Science, Helsinki University of Technology, Finland FI-02015
  • Janne Nikkilä: Department of Information and Computer Science, Helsinki University of Technology, Finland FI-02015 and Department of Basic Veterinary Sciences, (Division of Microbiology and Epidemiology), Facult ...
  • Matej Orešič: VTT Technical Research Centre of Finland, Espoo, Finland FIN-02044
  • Samuel Kaski: Department of Information and Computer Science, Helsinki University of Technology, Finland FI-02015

  • Venue: ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
  • Year: 2009

Abstract

We present a Bayesian model for two-way ANOVA-type analysis of high-dimensional datasets with small sample sizes and highly correlated groups of variables. Modern cellular measurement methods are a main application area; the typical task is differential analysis between diseased and healthy samples, complicated by additional covariates that require a multi-way analysis. The main difficulty is the combination of high dimensionality and low sample size, which makes classical multivariate techniques inapplicable. We introduce a hierarchical model that performs dimensionality reduction by assuming that the input variables come in similarly behaving groups, and then performs an ANOVA-type decomposition on the resulting low-dimensional latent variables. We apply the method to lipidomic profiles from a recent large-cohort human diabetes study.
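
The two-stage idea in the abstract (reduce correlated variable groups to a few latent variables, then decompose those by two factors) can be illustrated with a rough, non-Bayesian sketch. This is not the authors' hierarchical model: it assumes the group assignment of each variable is already known, replaces the latent-variable model with a plain average of standardized variables, and gives point estimates rather than posterior distributions. The names `group_scores` and `two_way_effects`, the synthetic data, and the effect sizes are all hypothetical and chosen only for illustration.

```python
import numpy as np


def group_scores(x, groups, n_groups):
    """Collapse each group of correlated variables into one score per sample.

    Assumption: `groups` (one label per column of x) is known in advance.
    Each column is standardized, then averaged within its group.
    """
    xs = (x - x.mean(axis=0)) / x.std(axis=0)
    return np.column_stack(
        [xs[:, groups == k].mean(axis=1) for k in range(n_groups)]
    )


def two_way_effects(z, a, b):
    """Two-way ANOVA-type decomposition of cell means (balanced design).

    z: (n_samples, n_latent) scores; a, b: factor labels per sample.
    Returns grand mean, main effects of a and b, interaction, and cell means.
    """
    levels_a, levels_b = np.unique(a), np.unique(b)
    cell = np.array(
        [[z[(a == i) & (b == j)].mean(axis=0) for j in levels_b]
         for i in levels_a]
    )                                          # shape (A, B, K)
    mu = cell.mean(axis=(0, 1))                # grand mean, shape (K,)
    alpha = cell.mean(axis=1) - mu             # main effect of a, (A, K)
    beta = cell.mean(axis=0) - mu              # main effect of b, (B, K)
    inter = cell - mu - alpha[:, None, :] - beta[None, :, :]
    return mu, alpha, beta, inter, cell


# Synthetic example: 20 samples, 60 variables in 3 correlated groups.
rng = np.random.default_rng(0)
a = np.repeat([0, 1], 10)                      # e.g. healthy / diseased
b = np.tile(np.repeat([0, 1], 5), 2)           # a second covariate
groups = np.repeat(np.arange(3), 20)           # known group of each variable

latent = rng.normal(size=(20, 3))
latent[:, 0] += 2.0 * a                        # factor a shifts group 0
latent[:, 1] += 2.0 * b                        # factor b shifts group 1
x = latent[:, groups] + 0.3 * rng.normal(size=(20, 60))

z = group_scores(x, groups, 3)
mu, alpha, beta, inter, cell = two_way_effects(z, a, b)
print("main effect of a per latent variable:\n", alpha)
print("main effect of b per latent variable:\n", beta)
```

With only 20 samples and 60 variables, the decomposition is computed on 3 group scores rather than on the raw variables, which is the point of the dimensionality reduction step. A full Bayesian treatment would additionally infer the grouping and quantify uncertainty in each effect.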