Permutation test for incomplete paired data with application to cDNA microarray data

Authors:
Donghyeon Yu;Johan Lim;Feng Liang;Kyunga Kim;Byung Soo Kim;Woncheol Jang
Affiliations:
Department of Statistics, Seoul National University, Seoul, Republic of Korea;Department of Statistics, Seoul National University, Seoul, Republic of Korea;Department of Statistics, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA;Department of Statistics, Sookmyung Women's University, Seoul, Republic of Korea;Department of Applied Statistics, Yonsei University, Seoul, Republic of Korea;Department of Epidemiology and Biostatistics, University of Georgia, 30602 Athens, GA, USA
Venue:
Computational Statistics & Data Analysis
Year:
2012

Abstract

Paired data sets are common in microarray experiments, but some pairs are often incompletely observed for technical reasons. The main interest in such data is detecting differentially expressed genes, which are usually identified by testing the equality of mean expression levels within a pair. Although testing mean equality with incomplete paired data has received much attention in the literature, existing methods commonly assume normality of the data or rely on large-sample theory. In this paper, we propose a new permutation-based test that requires neither the normality assumption nor large-sample theory. We consider permutation statistics formed as linear mixtures of the paired and unpaired samples, and propose a procedure for finding the optimal mixture, which minimizes the conditional variance of the test statistic given the observations. Simulations compare the power of the proposed permutation tests numerically with that of existing methods. We apply the proposed method to identify differentially expressed genes in a colorectal cancer study.
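To make the idea of a mixed permutation statistic concrete, the following is a minimal sketch and not the procedure of the paper. It assumes the statistic is a weighted sum of the mean within-pair difference (complete pairs) and the difference of group means (unpaired observations), with a user-supplied `weight`; the paper's optimal, variance-minimizing choice of the mixture is not reproduced here. The function name `mixed_permutation_test` and all of its parameters are hypothetical. Complete pairs are permuted by flipping the sign of each difference, and unpaired observations by shuffling group labels.

```python
import numpy as np


def mixed_permutation_test(x_pair, y_pair, x_only, y_only,
                           weight=0.5, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation test for incomplete paired data.

    Illustrative sketch only: `weight` is fixed by the caller and is NOT
    the optimal mixture described in the abstract.
    """
    rng = np.random.default_rng(seed)
    # Within-pair differences from the completely observed pairs.
    d = np.asarray(x_pair, dtype=float) - np.asarray(y_pair, dtype=float)
    x_only = np.asarray(x_only, dtype=float)
    y_only = np.asarray(y_only, dtype=float)
    pooled = np.concatenate([x_only, y_only])
    n_x = x_only.size

    def mix(t_paired, t_unpaired):
        # Linear mixture of the paired and unpaired components.
        return weight * t_paired + (1.0 - weight) * t_unpaired

    observed = mix(d.mean(), x_only.mean() - y_only.mean())

    null = np.empty(n_perm)
    for b in range(n_perm):
        # Paired part: swapping labels within a pair flips the sign of d.
        signs = rng.choice([-1.0, 1.0], size=d.size)
        t_paired = (signs * d).mean()
        # Unpaired part: reassign group labels at random.
        shuffled = rng.permutation(pooled)
        t_unpaired = shuffled[:n_x].mean() - shuffled[n_x:].mean()
        null[b] = mix(t_paired, t_unpaired)

    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    p_value = (1 + np.sum(np.abs(null) >= abs(observed))) / (n_perm + 1)
    return observed, p_value


# Toy usage: 10 complete pairs plus 5 unpaired observations per condition.
y_pair = np.arange(10.0)
x_pair = y_pair + 5.0
stat, p = mixed_permutation_test(x_pair, y_pair,
                                 x_only=[10, 11, 12, 13, 14],
                                 y_only=[0, 1, 2, 3, 4])
print(stat, p)
```

In a microarray setting, a function like this would be applied gene by gene, followed by a multiple-testing adjustment. The fixed `weight` is the simplest possible choice and only a placeholder for a data-driven mixture.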