Computational selection of distinct class- and subclass-specific gene expression signatures

  • Authors:
  • Pierre R. Bushel;Hisham K. Hamadeh;Lee Bennett;James Green;Alan Ableson;Stephen Misener;Cynthia A. Afshari;Richard S. Paules

  • Affiliations:
  • National Institute of Environmental Health Sciences, P.O. Box 12233, Research Triangle Park, NC;National Institute of Environmental Health Sciences, P.O. Box 12233, Research Triangle Park, NC and Amgen Inc., One Amgen Center Drive, Thousand Oaks, CA;National Institute of Environmental Health Sciences, P.O. Box 12233, Research Triangle Park, NC;Molecular Mining Corporation, Kingston, Ont., Canada;Molecular Mining Corporation, Kingston, Ont., Canada;Molecular Mining Corporation, Kingston, Ont., Canada;National Institute of Environmental Health Sciences, P.O. Box 12233, Research Triangle Park, NC and Amgen Inc., One Amgen Center Drive, Thousand Oaks, CA;National Institute of Environmental Health Sciences, P.O. Box 12233, Research Triangle Park, NC

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2002

Abstract

In this investigation, we used statistical methods to select genes with expression profiles that partition classes and subclasses of biological samples. Gene expression data corresponding to liver samples from rats treated for 24 h with an enzyme inducer (phenobarbital) or a peroxisome proliferator (clofibrate, gemfibrozil or Wyeth 14,643) were subjected to a modified Z-score test to identify gene outliers and to a binomial distribution analysis to reduce the probability of detecting genes as differentially expressed by chance. Hierarchical clustering of 238 statistically valid differentially expressed genes partitioned class-specific gene expression signatures into groups that clustered samples exposed to the enzyme inducer or to peroxisome proliferators. Using analysis of variance (ANOVA) and linear discriminant analysis methods, we identified single genes as well as coupled gene expression profiles that separated the phenobarbital-treated from the peroxisome proliferator-treated samples and discerned the fibrate (gemfibrozil and clofibrate) subclass of peroxisome proliferators. A comparison of genes ranked by ANOVA with genes assessed as significant by mixed linear models analysis [J. Comput. Biol. 8 (2001) 625] or ranked by information gain revealed good congruence among the top 10 genes from each statistical method in the contrast between phenobarbital and peroxisome proliferator expression profiles. We propose building upon a classification regimen composed of analysis of replicate data, outlier diagnostics and gene selection procedures to utilize cDNA microarray data to categorize subclasses of samples exposed to pharmacologic agents.
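
The two screening steps named in the abstract, outlier detection by a modified Z-score and a binomial calculation for chance detection, can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formulas, so everything specific here is an assumption: the sketch uses the common median/MAD form of the modified Z-score (scaling constant 0.6745, cutoff 3.5), and the log ratios, the per-array chance rate of 0.05, and the "3 of 4 replicates" criterion are hypothetical values chosen only for illustration.

```python
from math import comb
from statistics import median


def modified_z_scores(values):
    """Median/MAD-based modified Z-scores (assumed form, not the paper's)."""
    med = median(values)
    # Median absolute deviation from the median.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; modified Z-score is undefined")
    return [0.6745 * (v - med) / mad for v in values]


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical log expression ratios for seven genes on one array.
log_ratios = [0.1, -0.2, 0.05, 0.0, 2.5, -0.1, 0.15]
scores = modified_z_scores(log_ratios)

# Flag genes whose modified Z-score exceeds the assumed cutoff of 3.5.
outliers = [i for i, z in enumerate(scores) if abs(z) > 3.5]

# Probability that a gene is flagged in at least 3 of 4 replicate arrays
# purely by chance, assuming an independent 5% chance rate per array.
chance = binomial_tail(3, 4, 0.05)

print("outlier gene indices:", outliers)
print("chance probability:", chance)
```

Under these assumptions, requiring agreement across replicates makes a chance call far less likely than the single-array rate, which is the role the abstract assigns to the binomial step.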