The Minimum Error Minimax Probability Machine
The Journal of Machine Learning Research
Pareto optimal linear classification
ICML '06 Proceedings of the 23rd international conference on Machine learning
Bioinformatics
Hi-index | 0.04 |
A paired data set is common in microarray experiments, where the data are often incompletely observed for some pairs due to various technical reasons. In microarray paired data sets, it is of main interest to detect differentially expressed genes, which are usually identified by testing the equality of means of expressions within a pair. While much attention has been paid to testing mean equality with incomplete paired data in previous literature, the existing methods commonly assume the normality of data or rely on the large sample theory. In this paper, we propose a new test based on permutations, which is free from the normality assumption and large sample theory. We consider permutation statistics with linear mixtures of paired and unpaired samples as test statistics, and propose a procedure to find the optimal mixture that minimizes the conditional variances of the test statistics, given the observations. Simulations are conducted for numerical power comparisons between the proposed permutation tests and other existing methods. We apply the proposed method to find differentially expressed genes for a colorectal cancer study.