Significance analysis and improved discovery of disease-specific Differentially Co-expressed Gene Sets in microarray data

  • Authors:
  • Haixia Li;R. Krishna Murthy Karuturi

  • Affiliations:
  • Computational and Mathematical Biology, Genome Institute of Singapore, A-STAR (Agency for Science, Technology and Research), 60 Biopolis Street, S138672, Republic of Singapore.;Computational and Mathematical Biology, Genome Institute of Singapore, A-STAR (Agency for Science, Technology and Research), 60 Biopolis Street, S138672, Republic of Singapore

  • Venue:
  • International Journal of Data Mining and Bioinformatics
  • Year:
  • 2010

Abstract

Kostka and Spang proposed a statistic (the KS-statistic) and an algorithm (the KS algorithm) to elicit Differentially Co-expressed Gene Sets (DCEGSs) by minimising the KS-statistic. We prove that the null distributions of the KS-statistic in the variance un-normalised and variance-normalised data settings are central and doubly non-central F-distributions, respectively. Based on this analysis, we propose two alternative but equivalent statistics whose null distributions are easier to evaluate. Further, we propose to improve the algorithm by objectively setting the search parameters through maximising the statistical significance of the resultant gene set, and by pre-filtering the genes with the Friendly Neighbours (FNs) algorithm.
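
The abstract does not spell out the form of the KS-statistic. As a rough illustration only, the sketch below assumes it is a ratio of additive-model residual mean squares computed for the same gene set in two conditions, so that a small value suggests the genes are tightly co-expressed in the first condition but not in the second. The function names `residual_score` and `ks_like_statistic` are hypothetical, and this is not presented as the authors' exact definition.

```python
def residual_score(block):
    """Residual mean square of a genes x samples block under an additive
    (gene effect + sample effect) model. Assumed form, for illustration."""
    n_genes = len(block)
    n_samples = len(block[0])
    row_means = [sum(row) / n_samples for row in block]
    col_means = [sum(row[j] for row in block) / n_genes
                 for j in range(n_samples)]
    grand_mean = sum(row_means) / n_genes

    # Sum of squared residuals after removing gene and sample effects.
    ss = 0.0
    for i, row in enumerate(block):
        for j, value in enumerate(row):
            ss += (value - row_means[i] - col_means[j] + grand_mean) ** 2

    # Degrees of freedom of the additive-model residual.
    return ss / ((n_genes - 1) * (n_samples - 1))


def ks_like_statistic(block_a, block_b):
    """Ratio of residual scores for one gene set in conditions A and B.
    Hypothetical helper; requires block_b to have a non-zero score."""
    return residual_score(block_a) / residual_score(block_b)


# Toy data: the same two genes measured in two samples per condition.
condition_a = [[1.0, 2.0], [3.0, 5.0]]
condition_b = [[2.0, 4.0], [6.0, 10.0]]
ratio = ks_like_statistic(condition_a, condition_b)
```

Under this assumed form, a ratio of mean squares is the kind of quantity one would expect to follow an F-type distribution under a null hypothesis, which is consistent with the result stated in the abstract.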