Meta-analysis for ranked discovery datasets: Theoretical framework and empirical demonstration for microarrays

  • Authors:
  • Elias Zintzaras; John P. A. Ioannidis

  • Affiliations:
  • Department of Biomathematics, University of Thessaly School of Medicine, Larissa and Department of Informatics, University of Piraeus, Piraeus, Greece and Institute for Clinical Research and Healt ...;Institute for Clinical Research and Health Policy Studies, Tufts-New England Medical Center, Tufts University School of Medicine, Boston, MA, USA and Clinical and Molecular Epidemiology Unit, Depa ...

  • Venue:
  • Computational Biology and Chemistry
  • Year:
  • 2008

Abstract

Combining results from different large-scale datasets of multidimensional biological signals (such as gene expression profiling) presents a major challenge. Methodologies are needed that can efficiently combine diverse datasets and also test the extent of diversity (heterogeneity) across the combined studies. We developed METa-analysis of RAnked DISCovery datasets (METRADISC), a generalized meta-analysis method for combining information across discovery-oriented datasets and for testing between-study heterogeneity for each biological variable of interest. The method is based on non-parametric Monte Carlo permutation testing. In each study, the tested biological variables are ranked according to their level of statistical significance. For each biological variable of interest, METRADISC tests its average rank and the between-study heterogeneity of its study-specific ranks. After accounting for ties and for differences in the variables tested across studies, we randomly permute the ranks within each study and calculate the simulated metrics of average rank and heterogeneity. The procedure is repeated to generate null distributions for the metrics. We demonstrate METRADISC empirically using gene expression data from seven studies comparing prostate cancer cases and normal controls. We offer a new tool for combining complex datasets derived from massive-testing, discovery-oriented research and for examining the diversity of results across the combined studies.
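
The rank-and-permute procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made for illustration only: rank 1 goes to the smallest p-value, ties receive midranks, heterogeneity is measured as the sum of squared deviations of the study-specific ranks from the average rank, every study tests the same variables (the abstract notes that METRADISC also handles differing variable sets, which this sketch does not), and only one-sided Monte Carlo p-values are computed. The names `midranks` and `rank_meta_analysis_sketch` are hypothetical.

```python
import random
from statistics import mean


def midranks(pvalues):
    """Rank p-values within one study (assumed: rank 1 = smallest p-value).

    Tied p-values share the average of the positions they occupy.
    """
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    ranks = [0.0] * len(pvalues)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and pvalues[order[j + 1]] == pvalues[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def column_metrics(rank_rows):
    """Per variable: (average rank, assumed heterogeneity metric)."""
    results = []
    for g in range(len(rank_rows[0])):
        column = [row[g] for row in rank_rows]
        avg = mean(column)
        # Assumed metric: sum of squared deviations from the average rank.
        het = sum((r - avg) ** 2 for r in column)
        results.append((avg, het))
    return results


def rank_meta_analysis_sketch(pvalue_matrix, n_perm=1000, seed=0):
    """pvalue_matrix: one list of p-values per study, same variables in each."""
    rng = random.Random(seed)
    ranks = [midranks(study) for study in pvalue_matrix]
    observed = column_metrics(ranks)
    n_vars = len(observed)
    low_avg = [0] * n_vars   # permutations with average rank <= observed
    high_het = [0] * n_vars  # permutations with heterogeneity >= observed
    for _ in range(n_perm):
        # Permute the ranks independently within each study.
        permuted = [rng.sample(row, len(row)) for row in ranks]
        simulated = column_metrics(permuted)
        for g in range(n_vars):
            if simulated[g][0] <= observed[g][0]:
                low_avg[g] += 1
            if simulated[g][1] >= observed[g][1]:
                high_het[g] += 1
    return [
        {
            "avg_rank": observed[g][0],
            "heterogeneity": observed[g][1],
            # Add-one correction keeps Monte Carlo p-values above zero.
            "p_avg_low": (low_avg[g] + 1) / (n_perm + 1),
            "p_het_high": (high_het[g] + 1) / (n_perm + 1),
        }
        for g in range(n_vars)
    ]


if __name__ == "__main__":
    # Toy data: 3 studies x 5 variables; variable 0 is top-ranked everywhere.
    toy = [
        [0.001, 0.20, 0.50, 0.50, 0.90],
        [0.002, 0.70, 0.10, 0.40, 0.30],
        [0.010, 0.05, 0.80, 0.60, 0.20],
    ]
    for g, row in enumerate(rank_meta_analysis_sketch(toy, n_perm=500)):
        print(g, row)
```

Permuting ranks independently within each study breaks any association between variables across studies while preserving each study's rank distribution, which is what makes the simulated metrics usable as null distributions. A variable with a consistently low average rank and low heterogeneity across studies would be a candidate for a replicated signal under these assumptions.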