Motivation: In haploinsufficiency profiling data, pleiotropic genes are often misclassified by clustering algorithms that constrain each gene or experiment to belong to only one cluster. We have developed a general probabilistic model that clusters genes and experiments without requiring that a given gene or drug appear in only one cluster. The model also incorporates the functional annotation of known genes to guide the clustering procedure.

Results: We applied our model to the clustering of 79 chemogenomic experiments in yeast. The known pleiotropic genes PDR5 and MAL11 are represented more accurately by the model than by a clustering procedure that requires each gene to belong to a single cluster. Drugs such as miconazole and fenpropimorph, which have different targets but similar off-target genes, are clustered more accurately by the model-based framework. We show that this model is useful for summarizing the relationships among treatments and the genes affected by those treatments in a compendium of microarray profiles.

Availability: Supplementary information and computer code are available at http://genomics.lbl.gov/llda

Contact: flaherty@berkeley.edu