Faster query algorithms for the text fingerprinting problem

  • Authors:
  • Chi-Yuan Chan; Hung-I Yu; Wing-Kai Hon; Biing-Feng Wang

  • Affiliations:
  • Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan 30043, ROC (all authors)

  • Venue:
  • Information and Computation
  • Year:
  • 2011

Abstract

Let S be a string of length n over a finite, ordered alphabet Σ. For any substring S' of S, the set of distinct characters contained in S' is called its fingerprint. The text fingerprinting indexing problem is to construct a data structure for S in advance so that, given any input subset C of Σ, the following queries can be answered efficiently: (1) determine whether C is the fingerprint of some substring of S; (2) find all maximal substrings of S whose fingerprint is C. The best previously known results answer these two queries in Θ(|Σ|) and Θ(|Σ| + K) time, respectively, where K is the number of such maximal substrings. In this paper, we propose two improved algorithms for the text fingerprinting indexing problem. The first answers the two queries in O(|C| log n) and O(|C| log n + K) time, respectively. The second further reduces the query times to O(|C| log(|Σ|/|C|)) and O(|C| log(|Σ|/|C|) + K). Both results answer an open problem posed by Amir et al.
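
To make the two queries concrete, the following is a minimal brute-force sketch in Python. It is not the indexing method proposed in the paper: it builds no data structure and simply scans S once per query, so it takes time linear in the length of S rather than the bounds stated above. It assumes that a substring is maximal when it cannot be extended on either side by a character of C. The function names `maximal_locations` and `is_fingerprint` are hypothetical, chosen only for illustration.

```python
def maximal_locations(s, c):
    """Query (2), brute force: return 0-based inclusive (start, end) pairs
    of the maximal substrings of s whose fingerprint is exactly set(c).

    Assumption: "maximal" means the substring cannot be extended to the
    left or right by a character that belongs to c.
    """
    target = set(c)
    if not target:
        # Illustrative choice: an empty query set matches nothing.
        return []
    result = []
    i, n = 0, len(s)
    while i < n:
        if s[i] not in target:
            i += 1
            continue
        # Extend a run made only of characters from the query set.
        j = i
        seen = set()
        while j < n and s[j] in target:
            seen.add(s[j])
            j += 1
        # The run is a maximal location only if it uses every character of c.
        if seen == target:
            result.append((i, j - 1))
        i = j
    return result


def is_fingerprint(s, c):
    """Query (1), brute force: is set(c) the fingerprint of some substring of s?"""
    return bool(maximal_locations(s, c))
```

For example, with S = "abacabc" and C = {a, b}, the scan finds the runs "aba" and "ab", so `maximal_locations("abacabc", "ab")` returns `[(0, 2), (4, 5)]`.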