The L1-version of the Cramér-von Mises test for two-sample comparisons in microarray data analysis

  • Authors:
  • Yuanhui Xiao;Alexander Gordon;Andrei Yakovlev

  • Affiliations:
  • Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY and Department of Mathematics and Statistics, Georgia State University, Atlanta, GA;Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY and Department of Mathematics and Statistics, University of North Carolina at Charlotte, Charlotte, NC;Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY

  • Venue:
  • EURASIP Journal on Bioinformatics and Systems Biology
  • Year:
  • 2006

Abstract

Distribution-free statistical tests offer clear advantages in situations where the exact unadjusted p-values are required as input for multiple testing procedures. Such situations prevail when testing for differential expression of genes in microarray studies. The Cramér-von Mises two-sample test, based on a certain L2-distance between two empirical distribution functions, is a distribution-free test that has proven to be a good choice. A numerical algorithm is available for computing quantiles of the sampling distribution of the Cramér-von Mises test statistic in finite samples. However, the computation is very time- and space-consuming. An L1 counterpart of the Cramér-von Mises test represents an appealing alternative. In this work, we present an efficient algorithm for computing exact quantiles of the L1-distance test statistic. The performance and power of the L1-distance test are compared with those of the Cramér-von Mises and two other classical tests, using both simulated data and a large set of microarray data on childhood leukemia. The L1-distance test appears to be nearly as powerful as its L2 counterpart. The lower computational intensity of the L1-distance test allows computation of exact quantiles of the null distribution for larger sample sizes than is possible for the Cramér-von Mises test.
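
The idea of comparing two samples through L1- and L2-distances between their empirical distribution functions can be illustrated with a minimal Python sketch. The abstract does not state the exact form or normalisation of either statistic, so the unnormalised sums over the pooled sample used here are an assumption, and the function names `ecdf_distances` and `exact_l1_pvalue` are hypothetical. The exact p-value is obtained by brute-force enumeration of all splits of the pooled sample, which is feasible only for very small samples and is not the efficient algorithm the paper presents.

```python
from itertools import combinations


def ecdf_distances(x, y):
    """Return (l1, l2): sums of |F - G| and (F - G)^2 over the pooled sample.

    F and G are the empirical distribution functions of x and y.
    The lack of normalisation is an assumption made for illustration.
    """
    x, y = list(x), list(y)
    m, n = len(x), len(y)
    l1 = l2 = 0.0
    for t in sorted(x + y):
        f = sum(v <= t for v in x) / m  # F evaluated at t
        g = sum(v <= t for v in y) / n  # G evaluated at t
        d = f - g
        l1 += abs(d)
        l2 += d * d
    return l1, l2


def exact_l1_pvalue(x, y):
    """Exact permutation p-value of the L1 statistic (small samples only).

    Enumerates every way of splitting the pooled sample into groups of
    sizes len(x) and len(y); the cost grows as C(m + n, m).
    """
    pooled = list(x) + list(y)
    m = len(x)
    observed, _ = ecdf_distances(x, y)
    indices = range(len(pooled))
    at_least_as_extreme = total = 0
    for chosen in combinations(indices, m):
        chosen_set = set(chosen)
        xs = [pooled[i] for i in chosen]
        ys = [pooled[i] for i in indices if i not in chosen_set]
        stat, _ = ecdf_distances(xs, ys)
        total += 1
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / total


if __name__ == "__main__":
    a, b = [1, 2, 3], [4, 5, 6]
    print(ecdf_distances(a, b))
    print(exact_l1_pvalue(a, b))
```

Because the null distribution obtained this way depends only on the sample sizes when there are no ties, the resulting p-values are exact and distribution-free, which is the property the abstract highlights as important for multiple testing procedures.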