Extracting approximate patterns

  • Authors: Johann Pelfrêne; Saïd Abdeddaïm; Joël Alexandre

  • Affiliations: ExonHit Therapeutics, Boulevard Masséna, Paris and ABISS, UMR, CNRS, Université de Rouen, Mont Saint Aignan; ABISS, LIFAR, Université de Rouen, Mont Saint Aignan; ABISS, UMR, CNRS, Université de Rouen, Mont Saint Aignan

  • Venue: CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching
  • Year: 2003

Abstract

The number of approximate patterns in a sequence is exponential. In this paper, we present a new notion of basis for the patterns with don't cares occurring in a given text (sequence). The primitive patterns are of interest because they are fewer in number than under previously known definitions (and, in one case, sub-linear in the size of the text), and because they can be used to extract all the patterns of a text. We present an incremental algorithm that computes the primitive patterns occurring at least q times in a text of length n, given the N primitive patterns occurring at least q − 1 times, in time $O(|\Sigma| N n^2 \log^2 n \log\log n)$. In the particular case where q = 2, the time complexity is only $O(|\Sigma| n^2 \log^2 n \log\log n)$. We also give an algorithm that decides whether a given pattern is primitive in a given text.
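
To make the vocabulary concrete, here is a minimal Python sketch of what it means for a pattern with don't cares to occur at least q times in a text. It is not the paper's algorithm and does not test primitivity, which the abstract does not define. The `.` don't-care symbol and the function names `occurrences` and `meets_quorum` are assumptions chosen for illustration.

```python
# Illustrative only: naive matching of a pattern with don't cares.
# The '.' symbol and both function names are hypothetical choices,
# not notation taken from the paper.
DONT_CARE = "."


def occurrences(pattern: str, text: str) -> list[int]:
    """Return every start position where the pattern matches the text.

    A don't-care position matches any single character of the text.
    """
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if all(p == DONT_CARE or p == c for p, c in zip(pattern, text[i:i + m]))
    ]


def meets_quorum(pattern: str, text: str, q: int) -> bool:
    """Return True if the pattern occurs at least q times in the text."""
    return len(occurrences(pattern, text)) >= q
```

For example, `occurrences("A.C", "ABCADCAB")` finds the pattern at positions 0 and 3, so the pattern meets the quorum q = 2 but not q = 3. This naive scan takes time proportional to the pattern length times the text length per pattern and is shown only to fix the definitions.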