An integrative approach for genomic island prediction in prokaryotic genomes

  • Authors:
  • Han Wang;John Fazekas;Matthew Booth;Qi Liu;Dongsheng Che

  • Affiliations:
  • Department of Computer Science, East Stroudsburg University, East Stroudsburg, PA;Department of Computer Science, East Stroudsburg University, East Stroudsburg, PA;Department of Computer Science, East Stroudsburg University, East Stroudsburg, PA;College of Life Science and Biotechnology, Tongji University, Shanghai, P.R. China;Department of Computer Science, East Stroudsburg University, East Stroudsburg, PA

  • Venue:
  • ISBRA'11 Proceedings of the 7th international conference on Bioinformatics research and applications
  • Year:
  • 2011

Abstract

A genomic island (GI) is a segment of genomic sequence that is horizontally transferred from other genomes. The detection of genomic islands is extremely important to medical research. Most current computational approaches that use sequence composition to predict genomic islands suffer from low prediction accuracy. In this paper, we report, for the first time, that gene information and intergenic distance differ between genomic islands and non-genomic islands. Using these two sources together with sequence information, we trained on genomic island datasets from 113 genomes and developed a decision-tree-based bagging model for genomic island prediction. To test the performance of our approach, we applied it to three genomes: Salmonella typhimurium LT2, Streptococcus pyogenes MGAS315, and Escherichia coli O157:H7 str. Sakai. The performance metrics show that our approach outperforms other sequence-composition-based approaches. We conclude that incorporating gene information and intergenic distance can improve genomic island prediction accuracy. Our prediction software, Genomic Island Hunter (GIHunter), is available at http://www.esu.edu/cpsc/che_lab/software/GIHunter.
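
To make the idea of bagging over tree-based classifiers concrete, the following is a minimal sketch in Python and is not the GIHunter implementation. It assumes each candidate region is summarized by two hypothetical features, a sequence composition deviation score and the mean intergenic distance, and it substitutes one-level decision trees (stumps) for full decision trees to stay short. All function names and the toy values are invented for illustration only.

```python
import random
from collections import Counter


def mean_intergenic_distance(genes):
    """Mean gap (bp) between adjacent genes given sorted (start, end) pairs.

    Overlapping genes contribute a gap of 0 (an assumption of this sketch).
    """
    gaps = [max(0, nxt[0] - cur[1] - 1) for cur, nxt in zip(genes, genes[1:])]
    return sum(gaps) / len(gaps) if gaps else 0.0


def fit_stump(rows, labels):
    """Exhaustively pick the (feature, threshold, polarity) with fewest errors."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for polarity in (1, 0):
                preds = [polarity if r[f] > t else 1 - polarity for r in rows]
                errors = sum(p != y for p, y in zip(preds, labels))
                if best is None or errors < best[0]:
                    best = (errors, f, t, polarity)
    return best[1:]


def predict_stump(stump, row):
    f, t, polarity = stump
    return polarity if row[f] > t else 1 - polarity


def fit_bagging(rows, labels, n_models=25, seed=0):
    """Train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Majority vote over the ensemble; 1 = genomic island, 0 = non-island."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Toy, made-up feature vectors: [composition_deviation, mean_intergenic_distance]
TOY_ROWS = [
    [0.01, 90.0], [0.02, 110.0], [0.03, 100.0], [0.02, 120.0],   # non-islands
    [0.09, 250.0], [0.11, 300.0], [0.10, 280.0], [0.12, 260.0],  # islands
]
TOY_LABELS = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    models = fit_bagging(TOY_ROWS, TOY_LABELS)
    print(predict_bagging(models, [0.20, 400.0]))
```

Training each model on a bootstrap resample and combining them by majority vote is what distinguishes bagging from a single tree; in practice the stumps would be replaced by full decision trees and the two toy features by the gene, intergenic distance, and sequence features described in the abstract.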