Motif GibbsGA: Sampling Transcription Factor Binding Sites Coupled with PSFM Optimization by GA

  • Authors:
  • Lifang Liu; Licheng Jiao

  • Affiliations:
  • School of Computer Science and Technology, Xidian University, Xi'an, China 710071 and Institute of Intelligent Information Processing, Xidian University, Xi'an, China 710071; Institute of Intelligent Information Processing, Xidian University, Xi'an, China 710071

  • Venue:
  • ISICA '09 Proceedings of the 4th International Symposium on Advances in Computation and Intelligence
  • Year:
  • 2009

Abstract

Identifying transcription factor binding sites (TFBSs), or motifs, plays an important role in deciphering the mechanisms of gene regulation. Although many experimental and computational methods have been developed, finding TFBSs remains a challenging problem. We propose a novel sampling-based motif-finding method, Motif GibbsGA, that couples Gibbs sampling with optimization of the position-specific frequency matrix (PSFM) by a genetic algorithm. Based on the PSFM motif model, a greedy strategy is used to choose the initial PSFM parameters. A Gibbs sampler is then built with respect to the PSFM model, and during the sampling process the PSFM is improved by the genetic algorithm. A post-processing step that adaptively adds and removes instances handles general cases with an arbitrary number of instances per sequence. We test our method on the benchmark data set compiled by Tompa et al. for assessing computational tools that predict TFBSs. On this data set, the performance of Motif GibbsGA compares well to, and in many cases exceeds, that of existing tools. This is attributed in part to the genetic algorithm's role in improving the PSFM.
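
To make the PSFM and the Gibbs sampling step concrete, the following is a minimal Python sketch of a generic site sampler, not the authors' implementation. It assumes exactly one motif instance per sequence, a uniform background model (so sampling weights reduce to the PSFM likelihood of each window), and a pseudocount of 1. The function names `build_psfm`, `window_likelihood`, and `gibbs_sweep` are hypothetical, and the greedy initialization, the genetic-algorithm refinement, and the post-processing step are omitted because the abstract does not specify them.

```python
import random

ALPHABET = "ACGT"


def build_psfm(sites, pseudocount=1.0):
    """Return column-wise base frequencies for equal-length sites.

    The pseudocount is an assumption of this sketch; it keeps every
    frequency strictly positive.
    """
    width = len(sites[0])
    psfm = []
    for col in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[col]] += 1
        total = sum(counts.values())
        psfm.append({base: counts[base] / total for base in ALPHABET})
    return psfm


def window_likelihood(psfm, window):
    """Probability of a window under the PSFM (columns independent)."""
    prob = 1.0
    for column, base in zip(psfm, window):
        prob *= column[base]
    return prob


def gibbs_sweep(sequences, positions, width, rng):
    """Resample each sequence's site start once, holding the others fixed."""
    positions = list(positions)
    for i, seq in enumerate(sequences):
        # Build the PSFM from the current sites of all other sequences.
        others = [
            s[p:p + width]
            for j, (s, p) in enumerate(zip(sequences, positions))
            if j != i
        ]
        psfm = build_psfm(others)
        # Weight every candidate start by its likelihood under that PSFM.
        weights = [
            window_likelihood(psfm, seq[k:k + width])
            for k in range(len(seq) - width + 1)
        ]
        positions[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return positions


if __name__ == "__main__":
    seqs = ["TTACGTT", "ACGTTTT", "TTTTACG"]
    starts = [0, 0, 0]
    rng = random.Random(0)
    for _ in range(20):
        starts = gibbs_sweep(seqs, starts, 3, rng)
    print(starts)
```

In a method of the kind the abstract describes, a refinement step such as a genetic algorithm would operate on the PSFM between sweeps; here the matrix is simply recomputed from the currently sampled sites.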