Gene set analysis using principal components

  • Authors:
  • Isa Kemal Pakatci;Wei Wang;Leonard McMillan

  • Affiliations:
  • University of North Carolina, Chapel Hill, NC;University of North Carolina, Chapel Hill, NC;University of North Carolina, Chapel Hill, NC

  • Venue:
  • Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology
  • Year:
  • 2010

Abstract

We present a new method for identifying gene sets associated with labeled samples, where the labels can be case versus control, or genotype differences. Existing approaches to this problem assume that variations observed within a group are due primarily to noise, and therefore look for significant mean shifts between groups. Biological evidence suggests that variations can also result from the coordination of genes. Our method attempts to identify changes in gene-gene correlation patterns and to assess their significance. We model gene-gene correlations using principal component analysis and compare their significance to a baseline of linear models generated by random permutations of the sample labels. Simulation results show that our method detects changes that are undetectable by Hotelling's T² method. Its performance on real data is comparable to existing methods, with the additional capability of detecting changes in gene interactions between sample groups.
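The abstract does not give the test statistic, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the statistic is the fraction of within-group variance captured by each group's leading principal component, and it compares that value against the same quantity computed under randomly permuted sample labels. The function names, the choice of statistic, and the simulated data (two genes whose correlation flips sign between groups while the group means stay equal) are all illustrative assumptions.

```python
import numpy as np


def explained_fraction(X, labels, k=1):
    """Fraction of within-group variance captured by each group's top-k PCs.

    X is a samples x genes matrix; labels assigns each sample to a group.
    This statistic is an assumption made for illustration only.
    """
    top, total = 0.0, 0.0
    for g in np.unique(labels):
        Xg = X[labels == g]
        Xc = Xg - Xg.mean(axis=0)                # center within the group
        s = np.linalg.svd(Xc, compute_uv=False)  # singular values, descending
        top += np.sum(s[:k] ** 2)
        total += np.sum(s ** 2)
    return top / total


def permutation_pvalue(X, labels, n_perm=200, k=1, seed=0):
    """Compare the observed statistic with a label-permutation baseline."""
    rng = np.random.default_rng(seed)
    observed = explained_fraction(X, labels, k)
    null = np.array([
        explained_fraction(X, rng.permutation(labels), k)
        for _ in range(n_perm)
    ])
    # Add-one correction keeps the estimated p-value strictly positive.
    p = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p


def simulate_correlation_flip(n=60, noise=0.3, seed=1):
    """Two genes: positively correlated in group 0, negatively in group 1.

    Both groups have mean zero, so there is no mean shift to detect.
    """
    rng = np.random.default_rng(seed)
    z0 = rng.normal(size=n)
    z1 = rng.normal(size=n)
    g0 = np.column_stack([z0, z0]) + noise * rng.normal(size=(n, 2))
    g1 = np.column_stack([z1, -z1]) + noise * rng.normal(size=(n, 2))
    X = np.vstack([g0, g1])
    labels = np.repeat([0, 1], n)
    return X, labels


if __name__ == "__main__":
    X, labels = simulate_correlation_flip()
    observed, p = permutation_pvalue(X, labels)
    print(f"explained fraction = {observed:.3f}, permutation p = {p:.4f}")
```

With the true labels, each group is well described by a single linear component. After permutation, each mixed group is close to isotropic, so the leading component explains much less of the variance. That gap is what the permutation baseline measures in this sketch.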