A model-based optimization framework for the inference on gene regulatory networks from DNA array data

  • Authors:
  • Reuben Thomas;Sanjay Mehrotra;Eleftherios T. Papoutsakis;Vassily Hatzimanikatis

  • Affiliations:
  • Department of Industrial Engineering and Management Science;Department of Industrial Engineering and Management Science;Department of Chemical and Biological Engineering, 2145 Sheridan Road, E136, Northwestern University, Evanston, IL 60208-3120, USA;Department of Chemical and Biological Engineering, 2145 Sheridan Road, E136, Northwestern University, Evanston, IL 60208-3120, USA

  • Venue:
  • Bioinformatics
  • Year:
  • 2004


Abstract

Motivation: Identifying the regulatory structures in genetic networks and formulating mechanistic models in the form of wiring diagrams are among the significant objectives of expression profiling with DNA microarray technologies. Both require the development and application of identification frameworks.

Results: We have developed a novel optimization framework for identifying regulation in a genetic network using the S-system modeling formalism. We show that balance equations on both mRNA and protein species lead to a formulation suitable for analyzing DNA-microarray data, in which protein concentrations have been eliminated and only relative mRNA concentrations are retained. Using this formulation, we examined whether it is possible to infer a set of possible genetic regulatory networks consistent with observed mRNA expression patterns. Two origins of changes in mRNA expression patterns were considered. The first derives from changes in the biophysical properties of the system that alter the molecular-interaction kinetics and/or message stability. The second is due to gene knock-outs. We reduced the identification problem to an optimization problem of the mixed-integer non-linear programming class and developed an algorithmic procedure for solving it. Using simulated data generated by our mathematical model, we show that our method can find the regulatory network from which the data were generated. We also show that the number of possible alternate genetic regulatory networks depends on the size of the dataset (i.e. the number of experiments), that this dependence is different for each of the two types of problems considered, and that a unique solution requires fewer datasets than previously estimated in the literature. This is the first method that also allows the identification of every possible regulatory network that could explain the data when the number of experiments does not allow identification of a unique regulatory structure.

Availability: The implementation of the algorithm in AMPL is available on request from the authors.
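
For readers unfamiliar with the S-system formalism named in the abstract, the following is a minimal Python sketch of the generic textbook form, dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij. It is not the authors' AMPL implementation and does not reproduce their protein-eliminated formulation or the mixed-integer optimization. The function names (`s_system_rates`, `wiring`) and the two-gene parameter values are assumptions chosen purely for illustration.

```python
import math

def s_system_rates(x, alpha, beta, g, h):
    """Generic S-system right-hand side (illustrative, not the paper's model).

    dx_i/dt = alpha_i * prod_j x_j**g[i][j] - beta_i * prod_j x_j**h[i][j]
    """
    n = len(x)
    rates = []
    for i in range(n):
        production = alpha[i] * math.prod(x[j] ** g[i][j] for j in range(n))
        degradation = beta[i] * math.prod(x[j] ** h[i][j] for j in range(n))
        rates.append(production - degradation)
    return rates

def wiring(g):
    """Sign pattern of the production exponents:
    +1 = activation, -1 = repression, 0 = no regulation."""
    return [[(e > 0) - (e < 0) for e in row] for row in g]

# Hypothetical two-gene network: gene 1 represses gene 0, gene 0 activates gene 1.
alpha = [2.0, 1.0]
beta = [1.0, 1.0]
g = [[0.0, -1.0],
     [1.0, 0.0]]
h = [[1.0, 0.0],   # first-order degradation of each species (assumed)
     [0.0, 1.0]]

# For these assumed values the steady state is x0 = x1 = sqrt(2).
x_steady = [math.sqrt(2.0), math.sqrt(2.0)]
print(s_system_rates(x_steady, alpha, beta, g, h))
print(wiring(g))
```

In this picture, the wiring diagram corresponds to which exponents are non-zero and what sign they carry, which is why deciding on the presence or absence of each interaction naturally introduces integer (on/off) variables alongside the continuous kinetic parameters.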