Analysis of variance components in gene expression data

  • Authors:
  • James J. Chen;Robert R. Delongchamp;Chen-An Tsai;Huey-Miin Hsueh;Frank Sistare;Karol L. Thompson;Varsha G. Desai;James C. Fuscoe

  • Affiliations:
  • Division of Biometry and Risk Assessment;Division of Biometry and Risk Assessment;Division of Biometry and Risk Assessment;Department of Statistics, National Chengchi University, Taipei, Taiwan;Division of Applied Pharmacology Research, Center for Drug Evaluation and Research, Food and Drug Administration, Laurel, MD 20708, USA;Division of Applied Pharmacology Research, Center for Drug Evaluation and Research, Food and Drug Administration, Laurel, MD 20708, USA;Center for Functional Genomics, Division of Genetic and Reproductive Toxicology, National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR 72079, USA;Center for Functional Genomics, Division of Genetic and Reproductive Toxicology, National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR 72079, USA

  • Venue:
  • Bioinformatics
  • Year:
  • 2004

Quantified Score

Hi-index 3.84

Abstract

Motivation: A microarray experiment is a multi-step process, and each step is a potential source of variation. There are two major sources of variation: biological and technical. This study presents a variance-components approach to investigating animal-to-animal, between-array, within-array and day-to-day variation in two data sets. The first data set involved estimation of technical variances for pooled control and pooled treated RNA samples. The variance components were the between-array variance and two nested within-array variances: between-section (the upper and lower sections of the array are replicates) and within-section (two adjacent spots of the same gene are printed within each section). The second experiment was conducted in four different weeks. Each week there were reference and test samples with a dye-flip replicate on two hybridization days. The variance components were the week-to-week, animal-to-animal, between-array and within-array variances.

Results: We applied a linear mixed-effects model to quantify the different sources of variation. In the first data set, the between-array variance is greater than the between-section variance, which, in turn, is greater than the within-section variance. In the second data set, for the reference samples, the week-to-week variance is larger than the between-array variance, which, in turn, is slightly larger than the within-array variance. For the test samples, the week-to-week variance is the largest, and the animal-to-animal variance is slightly larger than the between-array and within-array variances. However, in a gene-by-gene analysis, the animal-to-animal variance is smaller than the between-array variance in four out of five housekeeping genes. In summary, the largest variation observed is the week-to-week effect. Another important source of variability is the animal-to-animal variation.
Finally, we describe the use of variance-component estimates to determine optimal numbers of animals, arrays per animal and sections per array in planning microarray experiments.
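
The two ideas in the abstract, estimating nested variance components and using them to plan an experiment, can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract reports a linear mixed-effects fit, whereas this sketch substitutes the simpler ANOVA (method-of-moments) estimator, which applies only to balanced data. The function names, the cost figures and the grid search are all hypothetical, and the variance-of-the-mean formula is the standard one for a balanced nested design rather than one quoted from the paper.

```python
import itertools
import numpy as np


def nested_variance_components(y):
    """ANOVA (method-of-moments) estimates for a balanced nested layout.

    y has shape (arrays, sections, spots): sections are nested within arrays,
    and spots are replicate measurements within a section.
    Returns (between_array, between_section, within_section) variances.
    """
    y = np.asarray(y, dtype=float)
    a, s, m = y.shape
    section_mean = y.mean(axis=2)           # shape (a, s)
    array_mean = section_mean.mean(axis=1)  # shape (a,)
    grand_mean = array_mean.mean()

    ms_array = s * m * np.sum((array_mean - grand_mean) ** 2) / (a - 1)
    ms_section = m * np.sum((section_mean - array_mean[:, None]) ** 2) / (a * (s - 1))
    ms_within = np.sum((y - section_mean[:, :, None]) ** 2) / (a * s * (m - 1))

    # Equate mean squares to their expectations; negative estimates are set to 0.
    var_within = ms_within
    var_section = max(0.0, (ms_section - ms_within) / m)
    var_array = max(0.0, (ms_array - ms_section) / (s * m))
    return var_array, var_section, var_within


def variance_of_mean(var_animal, var_array, var_section,
                     n_animals, n_arrays, n_sections):
    """Variance of a group mean with sections nested in arrays nested in animals."""
    return (var_animal / n_animals
            + var_array / (n_animals * n_arrays)
            + var_section / (n_animals * n_arrays * n_sections))


def cheapest_design(var_animal, var_array, var_section, target_var,
                    cost_animal, cost_array, max_n=10, max_sections=2):
    """Grid search for the lowest-cost design whose mean meets target_var.

    Costs are hypothetical: each animal costs cost_animal, each array costs
    cost_array, and extra sections on an array are assumed to be free.
    Returns (cost, n_animals, n_arrays, n_sections), or None if nothing fits.
    """
    best = None
    grid = itertools.product(range(1, max_n + 1),
                             range(1, max_n + 1),
                             range(1, max_sections + 1))
    for n_a, n_r, n_s in grid:
        v = variance_of_mean(var_animal, var_array, var_section, n_a, n_r, n_s)
        if v > target_var:
            continue
        cost = n_a * (cost_animal + n_r * cost_array)
        if best is None or cost < best[0]:
            best = (cost, n_a, n_r, n_s)
    return best


if __name__ == "__main__":
    # Simulated log-ratios for one gene: 6 arrays, 2 sections, 2 spots each.
    rng = np.random.default_rng(0)
    a, s, m = 6, 2, 2
    y = (rng.normal(0, 0.3, (a, 1, 1))     # array effect
         + rng.normal(0, 0.2, (a, s, 1))   # section-within-array effect
         + rng.normal(0, 0.1, (a, s, m)))  # spot-level noise
    print(nested_variance_components(y))
    print(cheapest_design(0.04, 0.02, 0.01, target_var=0.015,
                          cost_animal=10.0, cost_array=5.0))
```

In this sketch, adding animals reduces every term of the variance, while adding arrays or sections reduces only the technical terms. When the animal-to-animal component is large, the search therefore tends to favour more animals over more arrays per animal, subject to the assumed costs.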