Analysis of differentially-regulated genes within a regulatory network by GPS genome navigation

  • Authors:
  • Igor Zwir; Henry Huang; Eduardo A. Groisman

  • Affiliations:
  • Department of Molecular Microbiology, Howard Hughes Medical Institute, Washington University School of Medicine, Campus Box 8230, 660 S. Euclid Avenue, St Louis, MO 63110, USA (all three authors)

  • Venue:
  • Bioinformatics
  • Year:
  • 2005

Quantified Score

Hi-index 3.84

Abstract

Motivation: A critical challenge of the post-genomic era is to understand how genes are differentially regulated even when they belong to a given network. Because the fundamental mechanism controlling gene expression operates at the level of transcription initiation, computational techniques have been developed that identify cis regulatory features and map such features into expression patterns to classify genes into distinct networks. However, these methods are not focused on distinguishing between differentially regulated genes within a given network. Here we describe an unsupervised machine learning method, termed GPS for gene promoter scan, that discriminates among co-regulated promoters by simultaneously considering both cis-acting regulatory features and gene expression. GPS is particularly useful for knowledge discovery in environments with reduced datasets and high levels of uncertainty.

Results: Application of this method to the enteric bacteria Escherichia coli and Salmonella enterica uncovered novel members, as well as regulatory interactions in the regulon controlled by the PhoP protein that were not discovered using previous approaches. The predictions made by GPS were experimentally validated to establish that the PhoP protein uses multiple mechanisms to control gene transcription, and is a central element in a highly connected network.

Availability: The scripts and programs used in this work are accessible from the gps-tools.wustl.edu website. Data and predictions are available by request.

Contact: groisman@borcim.wustl.edu

Supplementary information: http://gps-tools.wustl.edu/BIOINF-2005-1246R1-Supplemental.pdf