Motivation: Cluster analysis of gene expression profiles has been widely applied to group genes for gene function discovery, and many approaches have been proposed. The rationale is that genes with the same biological function, or involved in the same biological process, are more likely to be co-expressed and hence to form a cluster with similar expression patterns. However, most existing methods, including model-based clustering, ignore known gene functions when clustering.

Results: To take advantage of accumulating gene functional annotations, we propose incorporating known gene functions as prior probabilities in model-based clustering. In contrast to the global mixture model that standard model-based clustering applies to all genes, we use a stratified mixture model: one stratum corresponds to the genes of unknown function, while each of the others corresponds to genes sharing the same biological function or pathway. Genes in the same stratum are assumed to have the same prior probability of coming from a given cluster, while genes in different strata are allowed to have different prior probabilities of coming from that cluster. We derive a simple EM algorithm to fit the stratified model. A simulation study and an application to gene function prediction demonstrate the advantage of our proposal over the standard method.

Contact: weip@biostat.umn.edu
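
To make the stratified mixture idea concrete, the following is a minimal sketch of one way such a model could be fitted with EM. It is not the authors' implementation: the abstract does not specify the component densities, so the sketch assumes Gaussian components with diagonal covariance, and the function name `fit_stratified_gmm`, its parameters, and the integer stratum coding (for example, 0 for genes of unknown function) are all hypothetical. The only difference from a standard mixture model here is that the mixing proportions `pi[h, k]` are indexed by stratum `h` and are updated by averaging the posterior probabilities within each stratum.

```python
import numpy as np

def fit_stratified_gmm(X, strata, n_clusters, n_iter=100, seed=0):
    """EM for a Gaussian mixture whose mixing proportions depend on the stratum.

    X       : (n_genes, n_conditions) expression matrix
    strata  : (n_genes,) integer stratum label per gene (assumed coding)
    Returns : pi (n_strata, K), mu (K, d), var (K, d), tau (n_genes, K)
    """
    X = np.asarray(X, dtype=float)
    strata = np.asarray(strata, dtype=int)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    n_strata = int(strata.max()) + 1

    # Initialisation (an assumption): means from random genes, pooled
    # variances, and uniform stratum-specific priors.
    mu = X[rng.choice(n, size=n_clusters, replace=False)].copy()
    var = np.tile(X.var(axis=0) + 1e-6, (n_clusters, 1))
    pi = np.full((n_strata, n_clusters), 1.0 / n_clusters)

    for _ in range(n_iter):
        # E-step: posterior cluster probabilities, using each gene's
        # own stratum-specific prior pi[strata[i], :].
        diff = X[:, None, :] - mu[None, :, :]                      # (n, K, d)
        log_pdf = -0.5 * np.sum(
            diff ** 2 / var[None] + np.log(2.0 * np.pi * var[None]), axis=2
        )
        log_w = np.log(pi[strata] + 1e-300) + log_pdf
        log_w -= log_w.max(axis=1, keepdims=True)                  # stabilise
        tau = np.exp(log_w)
        tau /= tau.sum(axis=1, keepdims=True)

        # M-step, priors: average the posteriors within each stratum.
        for h in range(n_strata):
            members = strata == h
            if members.any():
                pi[h] = tau[members].mean(axis=0)

        # M-step, component parameters: shared across all strata.
        nk = tau.sum(axis=0) + 1e-12
        mu = (tau.T @ X) / nk[:, None]
        diff = X[:, None, :] - mu[None, :, :]
        var = np.einsum("nk,nkd->kd", tau, diff ** 2) / nk[:, None] + 1e-6

    return pi, mu, var, tau
```

Under these assumptions, a gene of unknown function could be assigned to the cluster with the largest entry in its row of `tau`, and each row of `pi` shows how strongly a functional category is associated with each cluster. With a single stratum the sketch reduces to an ordinary diagonal-covariance Gaussian mixture.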