New algorithms for text fingerprinting

  • Authors:
  • Roman Kolpakov; Mathieu Raffinot

  • Affiliations:
  • Liapunov French-Russian Institute, Lomonosov Moscow State University, Moscow, Russia; CNRS, Poncelet Laboratory, Independent University of Moscow, 11 street Bolchoï Vlassievski, 119 002 Moscow, Russia

  • Venue:
  • Journal of Discrete Algorithms
  • Year:
  • 2008

Abstract

Let $s = s_1 \ldots s_n$ be a text (or sequence) on a finite alphabet $\Sigma$. A fingerprint in $s$ is the set of distinct characters contained in one of its substrings. Fingerprinting a text consists of computing the set $F$ of all fingerprints of all its substrings and being able to efficiently answer several questions on this set. A given fingerprint $f \in F$ is represented by a binary array, F, of size $|\Sigma|$ named a fingerprint table. A fingerprint $f \in F$ admits a number of maximal locations in $s$, that is, substrings $s_i \ldots s_j$ whose alphabet is $f$ and such that $s_{i-1}$ and $s_{j+1}$, if defined, are not in $f$. The set of maximal locations is $L$, $|L| =$
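
The definitions in the abstract can be made concrete with a small brute-force sketch in Python. This is not the authors' algorithm: it simply enumerates every substring, which takes quadratic time, and is meant only to illustrate what a fingerprint, a fingerprint table, and a maximal location are. The function names `fingerprints_with_maximal_locations` and `fingerprint_table` are hypothetical, chosen for this illustration, and positions are 1-indexed to match the notation $s_i \ldots s_j$.

```python
def fingerprints_with_maximal_locations(s):
    """Map each fingerprint of s to the list of its maximal locations.

    A fingerprint is stored as a frozenset of characters. A location is a
    1-indexed, inclusive pair (i, j) standing for the substring s_i..s_j.
    Illustrative brute force over all substrings, not the paper's method.
    """
    n = len(s)
    table = {}
    for i in range(n):
        seen = set()  # alphabet of s[i..j], grown as j advances
        for j in range(i, n):
            seen.add(s[j])
            # Maximal: the neighbouring characters, if defined, are not in f.
            left_ok = i == 0 or s[i - 1] not in seen
            right_ok = j == n - 1 or s[j + 1] not in seen
            locations = table.setdefault(frozenset(seen), [])
            if left_ok and right_ok:
                locations.append((i + 1, j + 1))
    return table


def fingerprint_table(f, alphabet):
    """Binary array of size |alphabet|: 1 where the character belongs to f."""
    return [1 if c in f else 0 for c in alphabet]


if __name__ == "__main__":
    result = fingerprints_with_maximal_locations("aabc")
    for f, locations in sorted(result.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(f), fingerprint_table(f, "abc"), locations)
```

For the text `aabc`, this sketch finds six fingerprints. The fingerprint {a} has the single maximal location (1, 2): the substring `a` at position 1 alone is not maximal because the next character is also in the fingerprint.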