Empirical evaluation of consistency and accuracy of methods to detect differentially expressed genes based on microarray data

  • Authors:
  • Dake Yang;Rudolph S. Parrish;Guy N. Brock

  • Affiliations:
  • -;-;-

  • Venue:
  • Computers in Biology and Medicine
  • Year:
  • 2014

Quantified Score

Hi-index 0.00

Visualization

Abstract

Background: In this study, we empirically evaluated the consistency and accuracy of five different methods to detect differentially expressed genes (DEGs) based on microarray data. Methods: Five different methods were compared, including the t-test, significance analysis of microarrays (SAM), the empirical Bayes t-test (eBayes), t-tests relative to a threshold (TREAT), and assumption adequacy averaging (AAA). The percentage of overlapping genes (POG) and the percentage of overlapping genes related (POGR) scores were used to rank the different methods on their ability to maintain a consistent list of DEGs both within the same data set and across two different data sets concerning the same disease. The power of each method was evaluated based on a simulation approach which mimics the multivariate distribution of the original microarray data. Results: For smaller sample sizes (6 or less per group), moderated versions of the t-test (SAM, eBayes, and TREAT) were superior in terms of both power and consistency relative to the t-test and AAA, with TREAT having the highest consistency in each scenario. Differences in consistency were most pronounced for comparisons between two different data sets for the same disease. For larger sample sizes AAA had the highest power for detecting small effect sizes, while TREAT had the lowest. Discussion: For smaller sample sizes moderated versions of the t-test can generally be recommended, while for larger sample sizes selection of a method to detect DEGs may involve a compromise between consistency and power.