A latent variable model for chemogenomic profiling

  • Authors:
  • Patrick Flaherty; Guri Giaever; Jochen Kumm; Michael I. Jordan; Adam P. Arkin

  • Affiliations:
  • Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA 94720, USA; Stanford Genome Technology Center, Stanford University School of Medicine Palo Alto, CA 94304, USA; Stanford Genome Technology Center, Stanford University School of Medicine Palo Alto, CA 94304, USA; Division of Computer Science, Department of Statistics, University of California Berkeley, CA 94720, USA; Department of Bioengineering, University of California and Physical Biosciences Division, Lawrence Berkeley National Laboratory, Howard Hughes Medical Institute Berkeley, CA 94720, USA

  • Venue:
  • Bioinformatics
  • Year:
  • 2005

Abstract

Motivation: In haploinsufficiency profiling data, pleiotropic genes are often misclassified by clustering algorithms that impose the constraint that a gene or experiment belong to only one cluster. We have developed a general probabilistic model that clusters genes and experiments without requiring that a given gene or drug only appear in one cluster. The model also incorporates the functional annotation of known genes to guide the clustering procedure.

Results: We applied our model to the clustering of 79 chemogenomic experiments in yeast. Known pleiotropic genes PDR5 and MAL11 are more accurately represented by the model than by a clustering procedure that requires genes to belong to a single cluster. Drugs such as miconazole and fenpropimorph that have different targets but similar off-target genes are clustered more accurately by the model-based framework. We show that this model is useful for summarizing the relationship among treatments and genes affected by those treatments in a compendium of microarray profiles.

Availability: Supplementary information and computer code at http://genomics.lbl.gov/llda

Contact: flaherty@berkeley.edu