Motivation: A critical challenge of the post-genomic era is to understand how genes are differentially regulated even when they belong to the same network. Because the fundamental mechanism controlling gene expression operates at the level of transcription initiation, computational techniques have been developed that identify cis-regulatory features and map such features onto expression patterns to classify genes into distinct networks. However, these methods do not focus on distinguishing between differentially regulated genes within a given network. Here we describe an unsupervised machine learning method, termed GPS (gene promoter scan), that discriminates among co-regulated promoters by simultaneously considering both cis-acting regulatory features and gene expression. GPS is particularly useful for knowledge discovery in environments with reduced datasets and high levels of uncertainty.

Results: Application of this method to the enteric bacteria Escherichia coli and Salmonella enterica uncovered novel members of the regulon controlled by the PhoP protein, as well as regulatory interactions within it, that were not discovered using previous approaches. The predictions made by GPS were experimentally validated, establishing that the PhoP protein uses multiple mechanisms to control gene transcription and is a central element in a highly connected network.

Availability: The scripts and programs used in this work are accessible from the gps-tools.wustl.edu website. Data and predictions are available on request.

Contact: groisman@borcim.wustl.edu

Supplementary information: http://gps-tools.wustl.edu/BIOINF-2005-1246R1-Supplemental.pdf