Trade-offs using GAMID for the inference of DNA motifs that are represented in only a subset of sequences of interest

  • Authors: Jeffrey Thompson
  • Affiliations: University of Southern Maine, Portland, ME, USA
  • Venue: Proceedings of the 14th annual conference companion on Genetic and evolutionary computation
  • Year: 2012

Abstract

In prior work, we presented GAMID, an extension of GAMI (Genetic Algorithms for Motif Inference) that allows the system to ignore some of the input sequences when searching for candidate conserved motifs in noncoding DNA. This ability is useful both when looking for candidate motifs in co-expressed genes (where not all genes are expected to respond to the same transcription factors) and when looking for candidate motifs in divergent species (where functional elements might appear only in related species). In these cases, we would like to allow the inferred motif to be present in only a subset of the input data. By excluding some sequences from the match process, GAMID succeeded at finding known functional elements. Here, we use experiments on artificial data to show that GAMID's success at inferring motifs in subsets of the input data comes at a cost: it finds fewer motifs when they are present in all of the sequences. Therefore, GAMID is useful as an adjunct tool to GAMI, but it is not a replacement for GAMI's functionality.
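
The idea of excluding some sequences from the match process can be illustrated with a minimal sketch. This is not GAMID's actual fitness function, which the abstract does not specify. It assumes a simple scoring scheme in which a candidate motif is scored by its best ungapped match in each sequence, and the subset variant drops the worst-matching sequences. The function names and the `max_ignored` parameter are hypothetical.

```python
def best_match_score(motif, sequence):
    """Return the highest count of matching positions between the motif
    and any same-length window of the sequence (ungapped comparison)."""
    m = len(motif)
    if len(sequence) < m:
        return 0
    return max(
        sum(a == b for a, b in zip(motif, sequence[i:i + m]))
        for i in range(len(sequence) - m + 1)
    )


def fitness_all(motif, sequences):
    """Assumed all-sequences scoring: every sequence contributes its best match."""
    return sum(best_match_score(motif, s) for s in sequences)


def fitness_subset(motif, sequences, max_ignored):
    """Assumed subset scoring: the max_ignored worst-matching sequences
    are excluded before summing (hypothetical parameter)."""
    scores = sorted(
        (best_match_score(motif, s) for s in sequences), reverse=True
    )
    keep = max(0, len(scores) - max_ignored)
    return sum(scores[:keep])


# Toy data: three sequences contain ACGT, one is unrelated.
sequences = ["ACGTACGT", "TTACGTTT", "GGGGGGGG", "CCACGTCC"]
motif = "ACGT"
print(fitness_all(motif, sequences))
print(fitness_subset(motif, sequences, max_ignored=1))
```

In this toy input, ignoring the single unrelated sequence leaves a perfect match in every sequence that is kept, whereas the all-sequences score is pulled down by the poor match. The same freedom to discard sequences also changes which candidates score well when a motif really is present everywhere, which is the kind of trade-off the abstract reports.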