Significance analysis and improved discovery of disease-specific Differentially Co-expressed Gene Sets in microarray data

Authors:
Haixia Li;R. Krishna Murthy Karuturi
Affiliations:
Computational and Mathematical Biology, Genome Institute of Singapore, A*STAR (Agency for Science, Technology and Research), 60 Biopolis Street, S138672, Republic of Singapore (both authors)
Venue:
International Journal of Data Mining and Bioinformatics
Year:
2010

References (6)

Clustering gene expression patterns
RECOMB '99 Proceedings of the third annual international conference on Computational molecular biology

Biclustering of Expression Data
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology

Enhanced Biclustering on Expression Data
BIBE '03 Proceedings of the 3rd IEEE Symposium on BioInformatics and BioEngineering

Friendly Neighbors Method for Unsupervised Determination of Gene Significance in Time-course Microarray Data
BIBE '04 Proceedings of the 4th IEEE Symposium on Bioinformatics and Bioengineering

Finding disease specific alterations in the co-expression of genes
Bioinformatics

Algorithm to find gene expression profiles of deregulation and identify families of disease-altered genes
Bioinformatics

Cited by (2)

A heuristic biomarker selection approach based on professional tennis player ranking strategy
Computer Methods and Programs in Biomedicine

Identification of glioma cancer-alerted gene markers based on a diagnostic outcome correlation analysis preferential approach
International Journal of Data Mining and Bioinformatics

Abstract

Kostka and Spang proposed a statistic (the KS-statistic) and an algorithm (the KS algorithm) to elicit Differentially Co-expressed Gene Sets (DCEGSs) by minimising the KS-statistic. We prove that the null distributions of the KS-statistic in the variance un-normalised and variance-normalised data settings are the central and doubly non-central F-distributions, respectively. Based on this analysis, we propose two alternative but equivalent statistics whose null distributions are easier to evaluate. Further, we propose to improve the algorithm by setting the search parameters objectively, through maximising the statistical significance of the resulting gene set, and by pre-filtering the genes with the Friendly Neighbours (FNs) algorithm.
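
To make the idea of a co-expression score ratio concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the score of a gene set within one condition is the mean squared residual of a two-way additive (gene effect plus sample effect) model, and that the statistic is the ratio of that score between two conditions. The function names `additive_residual_score` and `score_ratio`, and the choice of which condition goes in the numerator, are hypothetical; the abstract does not define the statistic.

```python
def additive_residual_score(matrix):
    """Mean squared residual of a genes x samples matrix under a
    two-way additive model (assumed form of the per-condition score)."""
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    row_means = [sum(row) / n_samples for row in matrix]
    col_means = [
        sum(matrix[i][j] for i in range(n_genes)) / n_genes
        for j in range(n_samples)
    ]
    grand_mean = sum(row_means) / n_genes

    # Residual after removing the gene effect and the sample effect.
    rss = 0.0
    for i in range(n_genes):
        for j in range(n_samples):
            resid = matrix[i][j] - row_means[i] - col_means[j] + grand_mean
            rss += resid * resid

    # Divide by the residual degrees of freedom of the additive model.
    return rss / ((n_genes - 1) * (n_samples - 1))


def score_ratio(group_a, group_b):
    """Ratio of the two per-condition scores (hypothetical orientation):
    small values mean the gene set is more tightly co-expressed in group_a."""
    return additive_residual_score(group_a) / additive_residual_score(group_b)


# Toy data: rows are genes, columns are samples.
tight = [[1.0, 0.0], [0.0, 1.0]]   # small departures from additivity
loose = [[2.0, 0.0], [0.0, 2.0]]   # larger departures from additivity
ratio = score_ratio(tight, loose)
```

A search procedure would look for the gene set that minimises such a ratio. If the residuals in both conditions are independent Gaussian noise with equal variance, a ratio of this kind is a ratio of two scaled chi-square variables, which is in keeping with the central F null distribution stated above for the un-normalised setting.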