Discovering tightly regulated and differentially expressed gene sets in whole genome expression data

  • Authors:
  • Chun Ye; Eleazar Eskin

  • Affiliations:
  • Bioinformatics Program, University of California San Diego, La Jolla, CA 92093-0404, USA; Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA 92093-0404, USA

  • Venue:
  • Bioinformatics
  • Year:
  • 2007

Abstract

Motivation: A new type of expression data is now being collected that aims to measure the effect of genetic variation on gene expression in pathways. In these datasets, expression profiles are constructed for multiple strains of the same model organism under the same condition. The goal in analyzing these data is to find differences in regulatory patterns due to genetic variation between strains, often without a phenotype of interest in mind. We present a new method, based on notions of tight regulation and differential expression, that looks for sets of genes which appear to be significantly affected by genetic variation.

Results: When we use categorical phenotype information, as in the Alzheimer's and diabetes datasets, our method finds many of the same gene sets as gene set enrichment analysis. In addition, our notion of correlated gene sets allows us to focus on biological processes subject to tight regulation. In murine hematopoietic stem cells, we are able to discover significant gene sets independent of a phenotype of interest. Some of these gene sets are associated with several blood-related phenotypes.

Availability: The programs are available by request from the authors.

Contact: cye@bioinf.ucsd.edu
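
The abstract does not define the statistics behind "tight regulation", so the following is only an illustrative sketch and not the authors' method. It assumes that a tightly regulated gene set can be approximated as one whose genes are strongly correlated with each other across strains, and that significance can be judged against randomly drawn gene sets of the same size. The function names `tightness_score` and `permutation_pvalue`, the mean absolute Pearson correlation used as the score, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np


def tightness_score(expr, gene_idx):
    """Mean absolute pairwise Pearson correlation among the chosen genes.

    expr is a (n_genes, n_strains) array; gene_idx selects the rows of one
    gene set. This score is an assumed stand-in for "tight regulation".
    """
    corr = np.corrcoef(expr[gene_idx])            # gene-by-gene correlations
    upper = np.triu_indices(len(gene_idx), k=1)   # each gene pair once
    return float(np.mean(np.abs(corr[upper])))


def permutation_pvalue(expr, gene_idx, n_perm=1000, seed=0):
    """Compare the observed score with random gene sets of the same size."""
    rng = np.random.default_rng(seed)
    observed = tightness_score(expr, gene_idx)
    n_genes, set_size = expr.shape[0], len(gene_idx)
    hits = 0
    for _ in range(n_perm):
        random_idx = rng.choice(n_genes, size=set_size, replace=False)
        if tightness_score(expr, random_idx) >= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Synthetic example: 200 genes measured in 20 strains, where the first
# 10 genes share a common strain-dependent signal (made-up data).
data_rng = np.random.default_rng(42)
expr = data_rng.normal(size=(200, 20))
shared_signal = data_rng.normal(size=20)
expr[:10] = shared_signal + 0.3 * data_rng.normal(size=(10, 20))

score, p_value = permutation_pvalue(expr, np.arange(10), n_perm=500)
print(f"tightness score = {score:.2f}, empirical p-value = {p_value:.3f}")
```

Note that no phenotype labels enter this calculation, which mirrors the phenotype-independent setting described for the hematopoietic stem cell data. A differential expression component would need a separate statistic that the abstract does not specify.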