Regulatory component analysis: A semi-blind extraction approach to infer gene regulatory networks with imperfect biological knowledge

  • Authors:
  • Chen Wang; Jianhua Xuan; Ie-Ming Shih; Robert Clarke; Yue Wang

  • Affiliations:
  • Bradley Department of Electrical and Computer Engineering, Virginia Tech, Arlington, VA 22203, USA; Bradley Department of Electrical and Computer Engineering, Virginia Tech, Arlington, VA 22203, USA; Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, USA; Lombardi Comprehensive Cancer Center and Department of Oncology, Physiology and Biophysics, Georgetown University, Washington, DC 20057, USA; Bradley Department of Electrical and Computer Engineering, Virginia Tech, Arlington, VA 22203, USA

  • Venue:
  • Signal Processing
  • Year:
  • 2012

Abstract

With the advent of high-throughput biotechnology capable of monitoring genomic signals, understanding molecular cellular mechanisms through systems biology approaches has become increasingly promising. One active research topic in systems biology is the inference of gene transcriptional regulatory networks from various genomic data; this inference problem can be formulated as a linear model with latent signals associated with regulatory proteins called transcription factors (TFs). Because common statistical assumptions may not hold for genomic signals, typical latent variable algorithms such as independent component analysis (ICA) are unable to reveal the true underlying regulatory signals. Liao et al. [1] proposed performing the inference with an approach named network component analysis (NCA), which is optimized by least-squares fitting under biological knowledge constraints. However, the original NCA solution does not account for the incompleteness of biological knowledge or its inconsistency with gene expression data, either of which could greatly affect inference accuracy. To overcome these limitations, we propose a linear extraction scheme, regulatory component analysis (RCA), to infer the underlying regulatory signals even with partial biological knowledge. Numerical simulations show a significant improvement of RCA over NCA, not only when the signal-to-noise ratio (SNR) is low but also when the given biological knowledge is incomplete and inconsistent with the gene expression data. Furthermore, real biological experiments on Escherichia coli are performed for regulatory network inference and compared with several typical linear latent variable methods, again demonstrating the effectiveness and improved performance of the proposed algorithm.
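
To make the linear latent-signal model and the constrained least-squares idea attributed to NCA more concrete, the following is a minimal Python sketch. It is not the RCA algorithm, whose details the abstract does not give, and it is not claimed to reproduce the implementation of Liao et al. It assumes the model can be written as E ≈ A P, where E is a genes-by-samples expression matrix, A holds gene-to-TF regulatory strengths, and P holds the latent TF signals. The symbols E, A, P, the function name nca_als, the boolean connectivity mask, the use of alternating least squares, and the synthetic data are all assumptions introduced here for illustration.

```python
import numpy as np

def nca_als(E, mask, n_iter=200, seed=0):
    """Hypothetical sketch: fit E ~ A @ P by alternating least squares.

    E    : (genes, samples) expression matrix.
    mask : (genes, tfs) boolean matrix; True where prior knowledge
           allows a TF to regulate a gene. A is forced to zero elsewhere.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_tfs = mask.shape
    # Random start that already respects the zero pattern.
    A = rng.standard_normal((n_genes, n_tfs)) * mask
    for _ in range(n_iter):
        # Step 1: with A fixed, solve for the latent TF signals P.
        P, *_ = np.linalg.lstsq(A, E, rcond=None)
        # Step 2: with P fixed, refit each gene using only its allowed TFs.
        for g in range(n_genes):
            idx = np.flatnonzero(mask[g])
            A[g, :] = 0.0
            if idx.size == 0:
                continue  # gene with no known regulator stays at zero
            coef, *_ = np.linalg.lstsq(P[idx].T, E[g], rcond=None)
            A[g, idx] = coef
    return A, P

# Synthetic demonstration (all values are made up for illustration).
rng = np.random.default_rng(1)
mask = rng.random((30, 3)) < 0.4               # assumed prior connectivity
A_true = rng.standard_normal((30, 3)) * mask   # assumed regulatory strengths
P_true = rng.standard_normal((3, 20))          # assumed latent TF signals
E = A_true @ P_true + 0.05 * rng.standard_normal((30, 20))

A_hat, P_hat = nca_als(E, mask)
rel_err = np.linalg.norm(E - A_hat @ P_hat) / np.linalg.norm(E)
print("relative reconstruction error:", rel_err)
```

In this sketch the connectivity mask is treated as exact and complete. The abstract identifies exactly that assumption as the weakness of the original NCA solution: missing or wrong entries in the prior knowledge are imposed as hard constraints on the fit. Note also that a factorization of this kind is in general determined only up to a scaling of each TF signal, so recovered values should be compared with care.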