Fewer permutations, more accurate P-values

Authors:
Theo A. Knijnenburg; Lodewyk F. A. Wessels; Marcel J. T. Reinders; Ilya Shmulevich
Venue:
Bioinformatics
Year:
2009

Abstract

Motivation: Permutation tests have become a standard tool to assess the statistical significance of an event under investigation. The statistical significance, as expressed in a P-value, is calculated as the fraction of permutation values that are at least as extreme as the original statistic, which was derived from non-permuted data. This empirical method directly couples both the minimal obtainable P-value and the resolution of the P-value to the number of permutations. Thereby, it imposes upon itself the need for a very large number of permutations when small P-values are to be accurately estimated. This is computationally expensive and often infeasible.

Results: A method of computing P-values based on tail approximation is presented. The tail of the distribution of permutation values is approximated by a generalized Pareto distribution. A good fit and thus accurate P-value estimates can be obtained with a drastically reduced number of permutations when compared with the standard empirical way of computing P-values.

Availability: The Matlab code can be obtained from the corresponding author on request.

Contact: tknijnenburg@systemsbiology.org

Supplementary information: Supplementary data are available at Bioinformatics online.
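
The contrast drawn in the abstract, between the empirical P-value and a P-value read off a generalized Pareto distribution (GPD) fitted to the tail, can be illustrated with a minimal Python sketch. It is not the authors' Matlab code. The function names, the default of 250 tail exceedances, the midpoint threshold rule, and the method-of-moments fit are all assumptions made for brevity; the abstract does not say which estimator or threshold rule the paper uses. The synthetic "permutation values" are evenly spaced quantiles of a unit exponential, chosen only so the example is deterministic.

```python
import math
import statistics


def empirical_p_value(observed, perm_values):
    """Fraction of permutation values at least as extreme as the observed statistic."""
    exceed = sum(1 for v in perm_values if v >= observed)
    return exceed / len(perm_values)


def gpd_tail_p_value(observed, perm_values, n_exceed=250):
    """Approximate P(X >= observed) from a GPD fitted to the largest permutation values.

    Assumptions (not taken from the paper): the threshold is the midpoint between
    the n_exceed-th and (n_exceed + 1)-th largest values, and the GPD parameters
    are estimated by the method of moments.
    """
    ordered = sorted(perm_values)
    n = len(ordered)
    if n <= n_exceed:
        raise ValueError("need more permutation values than tail exceedances")

    threshold = (ordered[n - n_exceed - 1] + ordered[n - n_exceed]) / 2
    if observed <= threshold:
        # The observed statistic is not in the tail; the empirical estimate suffices.
        return empirical_p_value(observed, perm_values)

    # Excesses of the n_exceed largest values over the threshold.
    excesses = [v - threshold for v in ordered[n - n_exceed:]]
    mean = statistics.fmean(excesses)
    var = statistics.variance(excesses)

    # Method of moments for the GPD: mean^2 / var = 1 - 2 * shape.
    ratio = mean * mean / var
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * mean * (ratio + 1.0)

    # GPD survival function evaluated at the observed excess.
    z = observed - threshold
    if abs(shape) < 1e-12:
        tail = math.exp(-z / scale)
    else:
        base = 1.0 + shape * z / scale
        tail = base ** (-1.0 / shape) if base > 0.0 else 0.0

    # Scale by the fraction of permutation values that lie in the tail.
    return (n_exceed / n) * tail


# Synthetic stand-in for permutation values (an assumption for illustration only).
n_perm = 5000
perm_values = [-math.log(1.0 - (i + 0.5) / n_perm) for i in range(n_perm)]
observed = 12.0

p_emp = empirical_p_value(observed, perm_values)  # 0.0: no permutation value reaches 12
p_gpd = gpd_tail_p_value(observed, perm_values)   # small but nonzero tail estimate
```

With 5000 permutation values, the empirical estimate cannot resolve anything below 1/5000 and returns zero here, whereas the tail fit extrapolates beyond the largest permutation value. In practice the fit quality should be checked before trusting the extrapolated value.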