Two-way analysis of high-dimensional collinear data

  • Authors:
  • Ilkka Huopaniemi;Tommi Suvitaival;Janne Nikkilä;Matej Orešič;Samuel Kaski

  • Affiliations:
  • Department of Information and Computer Science, Helsinki University of Technology (TKK), Espoo, Finland 02015;Department of Information and Computer Science, Helsinki University of Technology (TKK), Espoo, Finland 02015;Department of Information and Computer Science, Helsinki University of Technology (TKK), Espoo, Finland 02015 and Department of Basic Veterinary Sciences (Division of Microbiology and Epidemiology ...;VTT Technical Research Centre of Finland (VTT), Espoo, Finland 02044;Department of Information and Computer Science, Helsinki University of Technology (TKK), Espoo, Finland 02015

  • Venue:
  • Data Mining and Knowledge Discovery
  • Year:
  • 2009

Abstract

We present a Bayesian model for two-way ANOVA-type analysis of high-dimensional, small sample-size datasets with highly correlated groups of variables. Modern cellular measurement methods are a main application area; typically the task is differential analysis between diseased and healthy samples, complicated by additional covariates that require a multi-way analysis. The main complication is the combination of high dimensionality and low sample size, which renders classical multivariate techniques useless. We introduce a hierarchical model that performs dimensionality reduction by assuming that the input variables come in similarly-behaving groups, and then performs an ANOVA-type decomposition on the resulting reduced-dimensional latent variables. We apply the method to lipidomic profiles from a recent large-cohort human diabetes study.
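The two-stage idea in the abstract (reduce correlated variables to a few group-level latent variables, then decompose those into two-way effects) can be illustrated with a small sketch. This is not the authors' Bayesian hierarchical model: the group memberships are assumed known rather than inferred, the dimensionality reduction is a plain within-group average, and the effects are point estimates from cell means with no posterior uncertainty. All function names, effect sizes, and noise levels below are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate(n_per_cell=5, vars_per_group=20, noise=0.3, seed=0):
    """Simulate collinear data driven by three latent variables (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    # Two binary factors, e.g. a = disease status, b = an additional covariate.
    a = np.repeat([0, 0, 1, 1], n_per_cell)
    b = np.repeat([0, 1, 0, 1], n_per_cell)
    # Assumed true effects, one entry per latent variable (variable group).
    alpha = np.array([2.0, 0.0, 0.0])   # main effect of a
    beta = np.array([0.0, -1.5, 0.0])   # main effect of b
    inter = np.array([0.0, 0.0, 1.0])   # interaction of a and b
    latent = (np.outer(a, alpha) + np.outer(b, beta) + np.outer(a * b, inter)
              + rng.normal(scale=0.5, size=(a.size, alpha.size)))
    # Each observed variable is a noisy copy of its group's latent variable,
    # which makes variables within a group highly correlated.
    labels = np.repeat(np.arange(alpha.size), vars_per_group)
    x = latent[:, labels] + rng.normal(scale=noise, size=(a.size, labels.size))
    return x, a, b, labels


def two_way_effects(x, a, b, labels):
    """Average variables within known groups, then estimate two-way effects."""
    n_groups = labels.max() + 1
    # Dimensionality reduction: one averaged score per group of variables.
    z = np.column_stack([x[:, labels == g].mean(axis=1) for g in range(n_groups)])
    # Cell means, shape (2, 2, n_groups), indexed by (level of a, level of b).
    m = np.array([[z[(a == i) & (b == j)].mean(axis=0) for j in (0, 1)]
                  for i in (0, 1)])
    # Effects relative to the baseline cell (a=0, b=0).
    alpha_hat = m[1, 0] - m[0, 0]
    beta_hat = m[0, 1] - m[0, 0]
    inter_hat = m[1, 1] - m[1, 0] - m[0, 1] + m[0, 0]
    return alpha_hat, beta_hat, inter_hat


if __name__ == "__main__":
    x, a, b, labels = simulate()
    print("data shape (samples, variables):", x.shape)
    for name, est in zip(("a", "b", "a:b"), two_way_effects(x, a, b, labels)):
        print(name, np.round(est, 2))
```

With the default settings there are 20 samples and 60 variables, so the sketch mimics the regime the abstract describes, where variables outnumber samples. Averaging within groups leaves only three quantities to decompose, which is what makes the two-way estimates tractable; a full Bayesian treatment would additionally infer the grouping and quantify uncertainty in the effects.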