Faster query algorithms for the text fingerprinting problem

Authors:
Chi-Yuan Chan; Hung-I Yu; Wing-Kai Hon; Biing-Feng Wang
Affiliations:
Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan 30043, ROC (all four authors)
Venue:
Information and Computation
Year:
2011

References (13):
Fast algorithms for finding nearest common ancestors. SIAM Journal on Computing.
Storing a Sparse Table with O(1) Worst Case Access Time. Journal of the ACM (JACM).
On finding lowest common ancestors: simplification and parallelization. SIAM Journal on Computing.
The string B-tree: a new data structure for string search in external memory and its applications. Journal of the ACM (JACM).
Deterministic dictionaries. Journal of Algorithms.
Constraint Grammar: A Language-Independent System for Parsing Unrestricted Text.
The LCA Problem Revisited. LATIN '00 Proceedings of the 4th Latin American Symposium on Theoretical Informatics.
Efficient text fingerprinting via Parikh mapping. Journal of Discrete Algorithms.
Character sets of strings. Journal of Discrete Algorithms.
Improved approximate common interval. Information Processing Letters.
New algorithms for text fingerprinting. Journal of Discrete Algorithms.
Efficient computation of approximate gene clusters based on reference occurrences. RECOMB-CG'10 Proceedings of the 2010 international conference on Comparative genomics.
New algorithms for text fingerprinting. CPM'06 Proceedings of the 17th Annual conference on Combinatorial Pattern Matching.

Cited by (1):
Various improvements to text fingerprinting. Journal of Discrete Algorithms.

Abstract

Let S be a string of length n over a finite, ordered alphabet Σ. For any substring S' of S, the set of distinct characters contained in S' is called its fingerprint. The text fingerprinting indexing problem is to construct a data structure for S in advance so that, given any input subset C of Σ, the following queries can be answered efficiently: (1) determine whether C is the fingerprint of some substring of S; (2) find all maximal substrings of S whose fingerprint is C. The best known results answer these two queries in Θ(|Σ|) and Θ(|Σ|+K) time, respectively, where K is the number of maximal substrings whose fingerprint is C. In this paper, we propose two improved algorithms for the text fingerprinting indexing problem. The first answers the two queries in O(|C| log n) and O(|C| log n + K) time, respectively. The second further reduces the query times to O(|C| log(|Σ|/|C|)) and O(|C| log(|Σ|/|C|) + K). Both results answer an open problem posed by Amir et al.
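
To make the two queries concrete, the following Python sketch answers them by a direct linear scan of S, with no precomputed index. It is a naive baseline for illustration only, not the data structure proposed in the paper, and it takes O(n) time per query rather than the bounds stated above. The function names (fingerprint, maximal_locations, is_fingerprint) are hypothetical, and two assumptions are made here: a substring is treated as maximal when it cannot be extended by one character on either side without changing its fingerprint, and the empty set is treated as not being a fingerprint.

```python
def fingerprint(substring):
    """Return the set of distinct characters in the given substring."""
    return frozenset(substring)


def maximal_locations(text, chars):
    """Query (2), naive version: list (start, end) pairs, inclusive, of the
    maximal substrings of text whose fingerprint is exactly chars.

    Assumption: a substring is maximal if extending it by one character
    to the left or right would change its fingerprint.
    """
    target = frozenset(chars)
    results = []
    if not target:
        # Assumption: the empty set is not considered a fingerprint.
        return results

    start = 0      # start index of the current run of characters from target
    seen = set()   # characters of target seen so far in the current run
    for i in range(len(text) + 1):
        ch = text[i] if i < len(text) else None  # None marks the end of text
        if ch is not None and ch in target:
            seen.add(ch)
        else:
            # The run text[start..i-1] ends here; it is bounded on both sides
            # by characters outside target (or by the ends of the text).
            if seen == target:
                results.append((start, i - 1))
            start = i + 1
            seen = set()
    return results


def is_fingerprint(text, chars):
    """Query (1), naive version: is chars the fingerprint of some substring?"""
    return len(maximal_locations(text, chars)) > 0
```

For example, with S = "abacdab", maximal_locations("abacdab", "ab") returns [(0, 2), (5, 6)], corresponding to the substrings "aba" and "ab", while is_fingerprint("abacdab", "bd") is False because b and d never occur in a common run. The indexing problem asks for a structure that avoids rescanning S on every query.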