Maximal and minimal representations of gapped and non-gapped motifs of a string

  • Authors:
  • Esko Ukkonen

  • Affiliations:
  • Helsinki Institute for Information Technology, Helsinki University of Technology and University of Helsinki, P.O. Box 68, FIN-00014 University of Helsinki, Finland

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2009

Quantified Score

Hi-index 5.23

Visualization

Abstract

The problems of finding maximal and minimal equivalent representations for gapped and non-gapped motifs as well as finding motifs that characterize a fixed set of occurrence locations for a given string are studied. We apply two equivalence relations on representations. The first one is the well-known occurrence-equivalence of motifs. The second equivalence is introduced for patterns of occurrence locations, to characterize such patterns by motifs. For both equivalences, quadratic-time algorithms are given for finding a maximal representative of an equivalence class. Finding a minimal representative is shown to be NP-complete in both cases. For non-gapped motifs suffix-tree-based linear-time algorithms are given for finding maximal and minimal representatives. Maximal (minimal) gapped motifs are composed of blocks that are maximal (minimal) non-gapped motifs, maximal and minimal non-gapped motifs thus making up a small basis for all motifs. The implied bound on the number of gapped motifs that have a fixed number of non-gapped blocks is also given.