Integrative network component analysis for regulatory network reconstruction

  • Authors:
  • Chen Wang;Jianhua Xuan;Li Chen;Po Zhao;Yue Wang;Robert Clarke;Eric P. Hoffman

  • Affiliations:
  • Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA;Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA;Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA;Research Center for Genetic Medicine, Children's National Medical Center, Washington, DC;Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA;Departments of Oncology and Physiology & Biophysics, Georgetown University, School of Medicine, Washington, DC;Research Center for Genetic Medicine, Children's National Medical Center, Washington, DC

  • Venue:
  • ISBRA'08 Proceedings of the 4th international conference on Bioinformatics research and applications
  • Year:
  • 2008

Abstract

Network Component Analysis (NCA) has shown its effectiveness in regulator identification by inferring the transcription factor activity (TFA) when both microarray data and ChIP-on-chip data are available. However, the NCA scheme is not applicable to many biological studies due to the lack of complete ChIP-on-chip data. In this paper, we propose an integrative NCA (iNCA) approach to combine motif information, limited ChIP-on-chip data, and gene expression data for regulatory network inference. Specifically, a Bayesian framework is adopted to develop a novel strategy, namely stability analysis with topological sampling, to infer key TFAs and their downstream gene targets. The iNCA approach with stability analysis reduces the computational cost by avoiding a direct estimation of the high-dimensional distribution in a traditional Bayesian approach. Stability indices are designed to measure the goodness of the estimated TFAs and their connectivity strengths. The approach can also be used to evaluate the confidence level of different data sources, considering the inevitable inconsistency among the data sources. The iNCA approach has been applied to a time course microarray data set of muscle regeneration. The experimental results show that iNCA can effectively integrate motif information, ChIP-on-chip data, and microarray data to identify key regulators and their gene targets in muscle regeneration. In particular, several identified TFAs like those of MyoD, myogenin and YY1 are well supported by biological experiments.
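
For readers unfamiliar with NCA, the following is a minimal sketch of the general idea, not the procedure from the paper. It assumes the commonly used NCA model in which the expression matrix is approximated as E ≈ A P, where A holds connectivity strengths restricted to a known TF-gene topology and P holds the TFAs, and it fits that model by alternating least squares. The function `tfa_stability` is a purely hypothetical illustration of scoring TFAs under randomly sampled topologies; it is not the paper's Bayesian framework or its stability indices. All names and parameters (`nca_als`, `tfa_stability`, `Z`, `drop`, `n_samples`) are assumptions introduced here.

```python
import numpy as np

def nca_als(E, Z, n_iter=200, seed=0):
    """Fit E ~= A @ P by alternating least squares (illustrative sketch).

    E: (genes x samples) expression matrix.
    Z: (genes x TFs) boolean mask, True where a TF-gene link is allowed.
    A is kept at exactly zero wherever Z is False.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_tfs = Z.shape
    A = np.where(Z, rng.standard_normal(Z.shape), 0.0)
    P = np.zeros((n_tfs, E.shape[1]))
    for _ in range(n_iter):
        # Update the TF activities with the connectivity matrix held fixed.
        P = np.linalg.lstsq(A, E, rcond=None)[0]
        # Update each gene's strengths using only the TFs its mask allows.
        for g in range(n_genes):
            idx = np.flatnonzero(Z[g])
            if idx.size:
                A[g, idx] = np.linalg.lstsq(P[idx].T, E[g], rcond=None)[0]
    return A, P

def tfa_stability(E, Z, n_samples=20, drop=0.1, n_iter=200, seed=0):
    """Hypothetical stability score per TF (an assumption, not the paper's index).

    Randomly drops a fraction `drop` of allowed links, refits, and averages the
    absolute correlation between each perturbed TFA and the reference TFA.
    Absolute correlation is used because NCA solutions are only defined up to
    a per-TF scale and sign.
    """
    rng = np.random.default_rng(seed)
    _, P_ref = nca_als(E, Z, n_iter=n_iter, seed=seed)
    scores = np.zeros(Z.shape[1])
    for s in range(n_samples):
        Zs = Z & (rng.random(Z.shape) > drop)  # sampled topology
        _, P_s = nca_als(E, Zs, n_iter=n_iter, seed=seed + s + 1)
        for k in range(Z.shape[1]):
            # Skip constant profiles, for which correlation is undefined.
            if P_ref[k].std() > 0 and P_s[k].std() > 0:
                scores[k] += abs(np.corrcoef(P_ref[k], P_s[k])[0, 1])
    return scores / n_samples
```

In this sketch a score near 1 would suggest that a TF's estimated activity profile changes little when the assumed topology is perturbed, while a low score would suggest that the estimate depends heavily on individual, possibly unreliable, links.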