A mixture model approach for the analysis of small exploratory microarray experiments

  • Authors:
  • W. M. Muir;G. J. M. Rosa;B. R. Pittendrigh;Z. Xu;S. D. Rider;M. Fountain;J. Ogas

  • Affiliations:
  • Department of Animal Sciences, Purdue University, W. Lafayette, IN 47907, United States;Department of Dairy Science, University of Wisconsin, Madison, WI 53706, United States;Department of Entomology, Purdue University, W. Lafayette, IN 47907, United States;Department of Botany and Plant Science, University of California, Riverside, CA 92521, United States;Department of Biochemistry, Purdue University, W. Lafayette, IN 47907, United States;Department of Biochemistry, Purdue University, W. Lafayette, IN 47907, United States;Department of Biochemistry, Purdue University, W. Lafayette, IN 47907, United States

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2009

Abstract

The microarray is an important and powerful tool for prescreening genes for further research. However, alternative methods are needed to increase power in small microarray experiments. Traditional parametric and even non-parametric tests lack power in such small experiments and suffer from distributional problems. A mixture model is described that is fitted directly to expression differences, assuming that genes under the alternative treatments are expressed or not in all combinations: (i) not expressed in either condition, (ii) expressed only under the first condition, (iii) expressed only under the second condition, and (iv) expressed under both conditions. With two treatments, this gives rise to 4 possible clusters. The approach is termed the Mean-Difference-Mixture-Model (MD-MM) method. The accuracy and power of the MD-MM were compared with those of other commonly used methods, using simulations, microarray data, and quantitative real-time PCR (qRT-PCR). The MD-MM was found to be superior to the other methods in most situations. The advantage was greatest in situations with few replicates, poor signal-to-noise ratios, or non-homogeneous variances.
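
To make the idea of clustering genes on their expression differences concrete, the following is a minimal sketch of a one-dimensional Gaussian mixture fitted by expectation-maximisation (EM). It is not the authors' MD-MM implementation: the abstract does not specify the component distributions, the fitting procedure, or how genes are assigned to clusters, so the Gaussian components, the quantile-based starting values, and the names `normal_pdf` and `fit_mixture` are all assumptions made for illustration. Note also that clusters (i) and (iv) may both centre near a difference of zero, so separating them would presumably rely on something other than the mean, which this sketch does not attempt to model.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_mixture(diffs, k=4, n_iter=200, min_sd=1e-3):
    """Fit a k-component 1-D Gaussian mixture to per-gene mean differences by EM.

    Illustrative assumption only: Gaussian components and quantile-based
    starting values are not taken from the paper.
    Returns (weights, means, sds, responsibilities).
    """
    xs = sorted(diffs)
    n = len(xs)
    # Starting values: means at evenly spaced quantiles, equal weights,
    # and a common spread derived from the overall standard deviation.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(xs) / n
    overall_sd = math.sqrt(sum((x - overall_mean) ** 2 for x in xs) / n)
    sds = [max(overall_sd / k, min_sd)] * k
    weights = [1.0 / k] * k
    resp = []

    for _ in range(n_iter):
        # E-step: posterior probability that each gene belongs to each cluster.
        resp = []
        for x in diffs:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])

        # M-step: update weight, mean and spread of each cluster.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty cluster unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, diffs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, diffs)) / nj
            sds[j] = max(math.sqrt(var), min_sd)

    return weights, means, sds, resp
```

In this sketch, `diffs` would hold one mean expression difference per gene (treatment 1 minus treatment 2), and each row of the returned responsibilities gives that gene's estimated probability of belonging to each cluster. A gene could then be flagged as differentially expressed when most of its probability falls in a cluster whose mean is far from zero; the threshold for "most" is a further choice the abstract does not state.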