Integration of genomic and proteomic data to predict synthetic genetic interactions using semi-supervised learning

  • Authors:
  • Zhuhong You; Shanwen Zhang; Liping Li

  • Affiliations:
  • Intelligent Computing Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, Anhui, China and Department of Automation, University of Science and Technology of China, He ...; Intelligent Computing Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, Anhui, China; The Institute of Soil and Water Conservation of Gansu, Lanzhou, China

  • Venue:
  • ICIC'09 Proceedings of the Intelligent computing 5th international conference on Emerging intelligent computing technology and applications
  • Year:
  • 2009

Abstract

A genetic interaction, in which two mutations have a combined effect not exhibited by either mutation alone, is a powerful and widely used tool for establishing functional linkages between genes. However, little is known about how genes interact genetically to produce phenotypes, and comprehensive experimental identification of genetic interactions at genome scale is laborious and time-consuming. In this paper, we present a computational systems biology method for analyzing synthetic genetic interactions. We first construct a high-quality functional gene network by integrating protein interaction, protein complex, and microarray gene expression data. We then extract network properties such as degree centrality and clustering coefficient, which reflect the local connectivity and global position of a gene and are expected to correlate with its functional properties. Finally, we relate synthetic genetic interactions to these functional network properties using graph-based semi-supervised learning, which incorporates both labeled and unlabeled data. Experimental results show that the semi-supervised method outperforms standard supervised learning algorithms, reaching a maximum accuracy of 97.1%. Its advantage is largest when the number of training samples is very small.
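
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy network, the gene names, the choice of gene-pair features, the RBF similarity, and the two labels are all hypothetical, and the propagation step is a simple iterative scheme with clamped labels, which may differ from the graph-based algorithm used in the paper. The sketch computes degree and clustering coefficient for each gene, turns them into gene-pair features, and spreads a few known labels to unlabeled pairs over a similarity graph.

```python
from itertools import combinations
from math import exp

# Hypothetical toy functional gene network (undirected); not data from the paper.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("D", "F")]


def build_adjacency(edges):
    """Return a dict mapping each gene to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree(adj, gene):
    """Degree centrality (unnormalised): number of neighbours."""
    return len(adj[gene])


def clustering_coefficient(adj, gene):
    """Fraction of neighbour pairs of `gene` that are themselves linked."""
    nbrs = adj[gene]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def pair_features(adj, a, b):
    """Symmetric feature vector for a gene pair (an assumed feature set)."""
    da, db = degree(adj, a), degree(adj, b)
    ca, cb = clustering_coefficient(adj, a), clustering_coefficient(adj, b)
    shared = len(adj[a] & adj[b])  # number of common neighbours
    return [da + db, abs(da - db), ca + cb, shared]


def rbf_weights(X, sigma=1.0):
    """Dense similarity graph over samples using an RBF kernel."""
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((p - q) ** 2 for p, q in zip(X[i], X[j]))
                W[i][j] = exp(-d2 / (2.0 * sigma ** 2))
    return W


def propagate_labels(W, labels, iters=200):
    """Iteratively average neighbour scores; labelled samples stay clamped.

    `labels` maps sample index -> 0 or 1. Returns a score in [0, 1] per sample.
    """
    n = len(W)
    f = [labels.get(i, 0.5) for i in range(n)]
    for _ in range(iters):
        new = f[:]
        for i in range(n):
            if i in labels:
                continue  # keep known labels fixed
            total = sum(W[i])
            if total > 0:
                new[i] = sum(W[i][j] * f[j] for j in range(n)) / total
        f = new
    return f


adj = build_adjacency(EDGES)
pairs = list(combinations(sorted(adj), 2))
X = [pair_features(adj, a, b) for a, b in pairs]

# Hypothetical labels: 1 = interacting pair, 0 = non-interacting pair.
labels = {pairs.index(("A", "B")): 1, pairs.index(("A", "F")): 0}
scores = propagate_labels(rbf_weights(X), labels)

for (a, b), s in zip(pairs, scores):
    print(f"{a}-{b}: {s:.3f}")
```

Because only two pairs carry labels here, every other score comes from the structure of the similarity graph, which is the setting where the abstract reports the largest gain over purely supervised learning. In practice the similarity graph would more likely be sparse (for example, k-nearest neighbours) and the features standardised before computing distances.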