Discovering biological motifs with genetic programming

  • Authors: Rolv Seehuus; Amund Tveit; Ole Edsberg
  • Affiliations: Norwegian University of Science and Technology, Trondheim, Norway (all authors)
  • Venue: GECCO '05: Proceedings of the 7th Annual Conference on Genetic and Evolutionary Computation
  • Year: 2005

Abstract

Choosing the right representation for a problem is important. In this article we introduce a linear genetic programming approach to motif discovery in protein families and present a thorough comparison between our approach and Koza-style genetic programming using automatically defined functions (ADFs). In a study of 45 protein families, we demonstrate that our algorithm, given equal processing resources and no prior knowledge in the shaping of datasets, consistently generates motifs of significantly better quality than those we found using trees as the representation. For several of the studied protein families we evolve motifs comparable to those found in Prosite, a manually curated database of protein motifs. Our linear genome gave better results than Koza-style genetic programming for 37 of the 45 families. The difference is statistically significant for 24 of the families at the 99% confidence level.
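
To make the notion of a motif concrete, the sketch below shows one way a Prosite-style pattern could be matched against protein sequences. It is an illustration only, not the representation or fitness function used in the paper, which the abstract does not describe. The function names `prosite_to_regex` and `count_matches`, the example pattern, and the toy sequences are all assumptions introduced here, and only a common subset of the pattern syntax is handled.

```python
import re

# One pattern element: optional '<' anchor, a core (x, a residue, an allowed
# set [..] or a forbidden set {..}), an optional repeat (n) or (n,m), and an
# optional '>' anchor. This covers only a subset of the Prosite syntax.
_ELEMENT = re.compile(
    r"(<?)(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>?)"
)

def prosite_to_regex(pattern: str) -> str:
    """Translate a Prosite-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported element: {element!r}")
        start, core, lo, hi, end = m.groups()
        if core == "x":                 # x matches any residue
            core = "."
        elif core.startswith("{"):      # {..} lists forbidden residues
            core = "[^" + core[1:-1] + "]"
        if lo is not None:              # repetition count or range
            core += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(("^" if start else "") + core + ("$" if end else ""))
    return "".join(parts)

def count_matches(pattern, family, others):
    """Count matching sequences inside the family (true positives) and
    outside it (false positives). A hypothetical quality measure."""
    regex = re.compile(prosite_to_regex(pattern))
    tp = sum(1 for seq in family if regex.search(seq))
    fp = sum(1 for seq in others if regex.search(seq))
    return tp, fp

# Toy data, invented for illustration.
regex_text = prosite_to_regex("C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H.")
tp, fp = count_matches("C-x(2)-C-x(3)-H.", ["AACAACAAAHA"], ["ACACAH"])
```

A candidate motif that matches many family members and few outside sequences would score well under such a measure; how the paper actually weighs these counts is not stated in the abstract.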