Let $s = s_1 \ldots s_n$ be a text (or sequence) over a finite alphabet $\Sigma$ of size $\sigma$. A fingerprint in $s$ is the set of distinct characters appearing in one of its substrings. The problem considered here is to compute the set $F$ of all fingerprints of all substrings of $s$ in order to answer certain questions on this set efficiently. A substring $s_i \ldots s_j$ is a maximal location for a fingerprint $f \in F$ (denoted by $\langle i, j \rangle$) if the alphabet of $s_i \ldots s_j$ is $f$ and $s_{i-1}$, $s_{j+1}$, if defined, are not in $f$. The set of maximal locations in $s$ is $L$ (it is easy to see that $|L| \le n\sigma$). Two maximal locations $\langle i, j \rangle$ and $\langle k, l \rangle$ such that $s_i \ldots s_j = s_k \ldots s_l$ are named copies, and the quotient set of $L$ according to the copy relation is denoted by $L_C$. We first present new exact efficient algorithms and data structures for the following three problems: (1) to compute $F$; (2) given $f$ as a set of distinct characters in $\Sigma$, to answer whether $f$ represents a fingerprint in $F$; (3) given $f$, to find all maximal locations of $f$ in $s$. As in papers on succinct data structures, all space complexities in this paper are counted in bits. Problem 1 is solved either in $O(n + |L_C| \log \sigma)$ worst-case time (all logarithms in this paper are base two) using $O((n + |L_C| + |F| \log \sigma) \log n)$ bits of space, or in $O(n + |L| \log \sigma)$ randomized expected time using $O((n + |F| \log \sigma) \log n)$ bits of space. Problem 2 is solved either in $O(|f|)$ expected time if only $O(|f| \log n)$ bits of working space for queries are allowed, or in worst-case $O(|f|/\epsilon)$ time if a working space of $O(\sigma^{\epsilon} \log n)$ bits is allowed (with $\epsilon$ a constant satisfying 0
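To make the definitions of fingerprint, maximal location, and copy concrete, the following brute-force sketch enumerates them directly from the definitions. It is not one of the algorithms announced above: it runs in quadratic time and makes no attempt at the stated time or space bounds. The function name `maximal_locations` and the dictionary representation are assumptions chosen for illustration only.

```python
from collections import defaultdict


def maximal_locations(s):
    """Map each fingerprint of s to its maximal locations (hypothetical helper).

    Positions are 1-based and inclusive, matching the <i, j> notation.
    Brute force over all substrings; illustrative only.
    """
    n = len(s)
    found = defaultdict(list)
    for i in range(n):
        alphabet = set()
        for j in range(i, n):
            alphabet.add(s[j])
            # s[i..j] is maximal if neither neighbouring character,
            # when it exists, belongs to the substring's alphabet.
            left_ok = i == 0 or s[i - 1] not in alphabet
            right_ok = j == n - 1 or s[j + 1] not in alphabet
            if left_ok and right_ok:
                found[frozenset(alphabet)].append((i + 1, j + 1))
    return dict(found)


locs = maximal_locations("abac")
F = set(locs)                                   # Problem 1: every fingerprint
L = [loc for group in locs.values() for loc in group]
# Copies are maximal locations spelling the same substring.
L_C = {"abac"[i - 1:j] for (i, j) in L}
is_fingerprint = frozenset("ab") in F           # Problem 2: membership query
where_a = locs[frozenset("a")]                  # Problem 3: locations of {a}
```

Every fingerprint has at least one maximal location, since any occurrence can be extended until it is blocked on both sides, so the dictionary keys coincide with $F$. For the text `abac`, the two maximal locations of the fingerprint {a} are copies of each other, which is why the quotient set is smaller than $L$.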