Using knowledge driven matrix factorization to reconstruct modular gene regulatory network

  • Authors:
  • Yang Zhou;Zheng Li;Xuerui Yang;Linxia Zhang;Shireesh Srivastava;Rong Jin;Christina Chan

  • Affiliations:
  • Department of Computer Science and Engineering, Michigan State University, East Lansing, MI;Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, MI;Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, MI;Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, MI;Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, MI;Department of Computer Science and Engineering, Michigan State University, East Lansing, MI;Department of Computer Science and Engineering, Michigan State University, East Lansing, MI and Department of Chemical Engineering and Materials Science

  • Venue:
  • AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
  • Year:
  • 2008

Abstract

Reconstructing gene networks from microarray data can provide information on the mechanisms that govern cellular processes, and numerous studies have addressed this problem. A popular method is to view the gene network as a Bayesian inference network and to apply structure learning methods to determine its topology. The Bayesian structure learning approach has several shortcomings for reconstructing gene networks, however. These include the high computational cost of analyzing a large number of genes and inefficiency in exploiting prior knowledge of co-regulation that could be derived from Gene Ontology (GO) information. In this paper, we present a knowledge driven matrix factorization (KMF) framework for reconstructing modular gene networks that addresses these shortcomings. In KMF, gene expression data is first used to estimate the correlation matrix. The gene modules and the interactions among the modules are then derived by factorizing the correlation matrix, and the prior knowledge in GO is integrated into the matrix factorization to help identify the gene modules. An alternating optimization algorithm is presented to find the solution efficiently. Experiments show that our algorithm performs significantly better in identifying gene modules than several state-of-the-art algorithms, and that the interactions among the modules uncovered by our algorithm are biologically meaningful.
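
The pipeline described in the abstract (estimate a correlation matrix, factorize it into modules and module interactions, bias the factorization with prior knowledge, and alternate between updates) can be illustrated with a minimal sketch. This is not the KMF objective or algorithm from the paper, which the abstract does not specify. As assumptions for illustration only, it approximates the correlation matrix C by M W Mᵀ with a nonnegative membership matrix M and an interaction matrix W, and it encodes prior knowledge as a graph Laplacian penalty built from a hypothetical gene-gene similarity matrix S (standing in for GO-derived similarity). The names `factorize`, `objective`, `lam`, and the synthetic data are all hypothetical.

```python
import numpy as np

def objective(C, M, W, L, lam):
    # Fit term plus a Laplacian penalty that pulls similar genes together.
    R = C - M @ W @ M.T
    return np.sum(R ** 2) + lam * np.trace(M.T @ L @ M)

def factorize(X, S, k, lam=0.1, iters=50, seed=0):
    """Illustrative alternating optimization (an assumption, not the paper's method).

    X: genes x samples expression matrix.
    S: genes x genes nonnegative symmetric prior similarity (hypothetical).
    k: number of modules.
    """
    rng = np.random.default_rng(seed)
    C = np.corrcoef(X)                    # gene-gene correlation matrix
    L = np.diag(S.sum(axis=1)) - S        # graph Laplacian of the prior
    M = rng.random((C.shape[0], k))       # nonnegative module memberships
    W = np.eye(k)                         # module-module interactions
    history = [objective(C, M, W, L, lam)]
    for _ in range(iters):
        # W step: least-squares solution of C ~ M W M^T for fixed M.
        P = np.linalg.pinv(M)
        W = P @ C @ P.T
        # M step: projected gradient step with backtracking for fixed W.
        R = M @ W @ M.T - C
        grad = 2 * (R @ M @ W.T + R.T @ M @ W) + 2 * lam * (L @ M)
        current = objective(C, M, W, L, lam)
        step = 1.0
        while step > 1e-12:
            candidate = np.maximum(M - step * grad, 0.0)  # keep M >= 0
            if objective(C, candidate, W, L, lam) <= current:
                M = candidate
                break
            step *= 0.5                   # shrink until no increase
        history.append(objective(C, M, W, L, lam))
    return M, W, history

# Synthetic example: two groups of five genes, each following a shared profile.
rng = np.random.default_rng(1)
base = rng.normal(size=(2, 30))
X = np.vstack([base[0] + 0.3 * rng.normal(size=(5, 30)),
               base[1] + 0.3 * rng.normal(size=(5, 30))])
S = np.zeros((10, 10))
S[:5, :5] = 1.0
S[5:, 5:] = 1.0
np.fill_diagonal(S, 0.0)

M, W, history = factorize(X, S, k=2)
modules = M.argmax(axis=1)  # hard module assignment per gene
```

In this sketch, each row of `M` gives a gene's soft membership across modules and `W` gives the strength of interaction between modules. The backtracking step is chosen only so that the objective never increases between iterations; a practical implementation would likely use a different update rule and a convergence check.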